Poster Presentation HUPO 2019 - 18th Human Proteome Organization World Congress

Functional prediction of uPE1 proteins using a “guilt-by-association” approach with the CCLE dataset (#484)

Guillermo Serrano 1, Elizabeth Guruceaga 1,2, José González-Gomariz 2, Fernando J Corrales 3, Victor Segura 1,2
  1. Platform of Bioinformatics, Cima Universidad de Navarra, Pamplona, Spain
  2. Platform of Bioinformatics, IdiSNA, Pamplona, Spain
  3. Proteomics Unit, Centro Nacional de Biotecnología (CSIC), Madrid, Spain

In recent years, the Chromosome-Centric Human Proteome Project (C-HPP) has focused much of its effort on detecting proteins without experimental evidence (missing proteins) using mass spectrometry-based technologies. However, new goals have now been established to improve the understanding of protein roles in normal cellular processes and human diseases. One such goal concerns the uncharacterized proteins with experimental evidence (uPE1s), which lack a known function in the cell and need to be studied in detail. One of the most popular methods to predict molecular function is the “guilt-by-association” approach, especially in transcriptomics, where a huge number of experiments are publicly available. In particular, we used the CCLE dataset, which contains more than 1000 RNA-Seq experiments of human cell lines, to calculate the correlation of the uPE1 genes with the PE1 genes of known function. Next, we performed a sample level functional enrichment analysis (SLEA) based on these correlations and the pathways and functions annotated in GO and KEGG. The result consisted of the set of functions and pathways enriched in the PE1 gene sets positively or negatively correlated with each uPE1 gene. This information was represented as a network to infer the functional characterization of the uPE1 proteins in an integrated view: functions common to all of them and functions specific to one uPE1 or a group of uPE1s. These bioinformatic predictions can then be used by the C-HPP research teams as guidance to design the biological experiments needed to validate novel functions of uncharacterized proteins.
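
The guilt-by-association idea described above can be illustrated with a minimal sketch. It is not the pipeline used for the poster: the data are synthetic rather than CCLE, the pathway is hypothetical, and a simple hypergeometric over-representation test stands in for the SLEA step. The function names `pearson_with_target` and `hypergeom_pvalue` are introduced here for illustration only.

```python
import math
import numpy as np


def pearson_with_target(target, annotated):
    """Pearson correlation between one gene's expression vector (n_samples,)
    and each row of an (n_genes, n_samples) matrix.
    Assumes no gene has constant expression (non-zero variance)."""
    t = target - target.mean()
    a = annotated - annotated.mean(axis=1, keepdims=True)
    denom = np.sqrt((t ** 2).sum()) * np.sqrt((a ** 2).sum(axis=1))
    return (a @ t) / denom


def hypergeom_pvalue(n_universe, n_in_set, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap):
    the chance of seeing at least n_overlap pathway genes among
    n_selected genes drawn at random from the universe."""
    total = math.comb(n_universe, n_selected)
    upper = min(n_in_set, n_selected)
    tail = sum(
        math.comb(n_in_set, k) * math.comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Synthetic toy data (an assumption, not CCLE): 20 annotated genes, 50 samples.
rng = np.random.default_rng(0)
n_samples, n_genes = 50, 20
signal = rng.normal(size=n_samples)            # shared latent expression pattern
annotated = rng.normal(size=(n_genes, n_samples))
pathway = np.arange(5)                         # hypothetical pathway: genes 0..4
annotated[pathway] = signal + 0.1 * rng.normal(size=(5, n_samples))
target = signal + 0.1 * rng.normal(size=n_samples)  # stands in for a uPE1 gene

# Step 1: correlate the uncharacterized gene with every annotated gene.
r = pearson_with_target(target, annotated)

# Step 2: keep the most positively correlated genes (cutoff of 5 is arbitrary).
top = np.argsort(r)[::-1][:5]

# Step 3: test whether the pathway is over-represented among them.
overlap = len(set(top.tolist()) & set(pathway.tolist()))
p_value = hypergeom_pvalue(n_genes, len(pathway), len(top), overlap)
print(f"overlap={overlap}, p={p_value:.2e}")
```

In this toy setting the uncharacterized gene shares its expression pattern with the hypothetical pathway genes, so they dominate its top correlations and the pathway appears enriched. Repeating the same steps for the most negatively correlated genes, and for every uPE1 gene against every GO or KEGG gene set, would give the uPE1-to-function associations that the abstract represents as a network.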