Poster Presentation HUPO 2019 - 18th Human Proteome Organization World Congress

Proteogenomic approach to UTR peptides identification (#477)

Seunghyuk Choi 1, Shinyeong Ju 2 3, Jinwon Lee 1, Seungjin Na 1, Cheolju Lee 2 4 5, Eunok Paek 1
  1. Department of Computer Science, Hanyang University, Seoul, Republic of Korea
  2. Center for Theragnosis, Korea Institute of Science and Technology, Seoul 02792, Republic of Korea
  3. Department of Bio-Medical Science and Research Institute for Natural Sciences, Hanyang University, Seoul, Republic of Korea
  4. Division of Bio-Medical Science and Technology, KIST School, Korea University of Science and Technology, Seoul, Republic of Korea
  5. Department of Converging Science and Technology, KHU-KIST, Kyung Hee University, Seoul, Republic of Korea

There have been many reports of translation of 5’ untranslated regions (5’-UTRs) and 3’ untranslated regions (3’-UTRs), mostly identified by ribosome profiling and/or tandem mass spectrometry (MS/MS) based proteomics. We propose a proteogenomic approach to identify UTR peptides from an MS/MS assay. First, we construct a translated UTR peptide database (tUTR DB) under the assumption that a UTR may be translated because of a single-nucleotide error in recognizing a START or STOP codon. We then apply a multi-stage search strategy (Madar et al., 2018), a method for rigorously identifying novel peptides. As a result, we identified 52 5’-UTR peptides and 9 3’-UTR peptides from an H1299 cell line dataset. The 5’-UTR and 3’-UTR peptides corresponded to 45 and 9 genes, respectively. Almost half of the 45 genes were also observed in a previous study. We further determined the alternative start codon of each 5’-UTR peptide based on codon frequencies, and then estimated the strength of its Kozak context. We classified the contexts into strong, weak, and non-Kozak classes (Lee et al., 2012). The Kozak class composition of the novel translation initiation sites (TISs) was comparable to that of the annotated TISs. As for read-through (RT) events at the 3’ end of the coding region, we identified translation of the 3’-UTR of the MDH1 gene, which is consistent with a previous report (Stiebler et al., 2014). Furthermore, we found that the stop codon was substituted with tryptophan, which was not detected by Ribo-Seq. Finally, we tested the expression of 29 UTR peptides by MS/MS analysis of synthetic peptides, and 28 of them were verified. Peptide identification using the tUTR DB together with the multi-stage strategy could rigorously identify UTR peptides translated due to single-nucleotide errors.
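
The single-nucleotide-error assumption and the Kozak classification described above can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names are hypothetical, the restriction to candidates in frame with the annotated start codon is an assumption made for brevity, and the Kozak rule used here (purine at position -3, G at position +4; both gives strong, one gives weak, neither gives non-Kozak) is a commonly used convention that may differ from the exact criteria applied in the study. The codon-frequency-based choice among candidates is not reproduced.

```python
def near_cognate_starts(start="ATG"):
    """Return all codons that differ from `start` by exactly one nucleotide."""
    codons = set()
    for i in range(3):
        for base in "ACGT":
            if base != start[i]:
                codons.add(start[:i] + base + start[i + 1:])
    return codons


def kozak_class(seq, pos):
    """Classify the context of a start codon whose first base is seq[pos].

    Assumed rule (a common convention, not confirmed by the abstract):
    purine at -3 and G at +4 -> strong; only one of them -> weak;
    neither -> non-kozak.
    """
    minus3 = seq[pos - 3] if pos >= 3 else ""
    plus4 = seq[pos + 3] if pos + 3 < len(seq) else ""
    has_purine = minus3 in ("A", "G")
    has_g = plus4 == "G"
    if has_purine and has_g:
        return "strong"
    if has_purine or has_g:
        return "weak"
    return "non-kozak"


def candidate_sites(transcript, cds_start):
    """List near-cognate codons upstream of, and in frame with, cds_start.

    `transcript` is a DNA-alphabet sequence containing the 5'-UTR followed
    by the coding region; `cds_start` is the 0-based index of the annotated
    start codon. The in-frame restriction is an assumption of this sketch.
    """
    starts = near_cognate_starts()
    sites = []
    for pos in range(cds_start % 3, cds_start, 3):
        codon = transcript[pos:pos + 3]
        if codon in starts:
            sites.append((pos, codon, kozak_class(transcript, pos)))
    return sites


if __name__ == "__main__":
    # Toy sequence: 12-nt 5'-UTR followed by the annotated ATG.
    toy = "AAACTGGCCACCATGGCC"
    for pos, codon, context in candidate_sites(toy, cds_start=12):
        print(pos, codon, context)
```

The same one-substitution idea applies at the 3’ end: the tryptophan codon TGG is a single nucleotide away from the stop codons TGA and TAG, which is the kind of event the tUTR DB is designed to capture.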

  1. Madar, I.H.; Lee, W.; Wang, X.; Ko, S.-I.; Kim, H.; Mun, D.-G.; Zhang, B.; Paek, E.; Lee, S.-W. Comprehensive and sensitive proteogenomics data analysis strategy based on complementary multi-stage database search. International Journal of Mass Spectrometry 2018, 427, 11–19.
  2. Lee, S.; Liu, B.; Lee, S.; Huang, S.-X.; Shen, B.; Qian, S.-B. Global mapping of translation initiation sites in mammalian cells at single-nucleotide resolution. Proc. Natl. Acad. Sci. U. S. A. 2012, 109, E2424–32.
  3. Stiebler, A.C.; Freitag, J.; Schink, K.O.; Stehlik, T.; Tillmann, B.A.M.; Ast, J.; Bölker, M. Ribosomal Readthrough at a Short UGA Stop Codon Context Triggers Dual Localization of Metabolic Enzymes in Fungi and Animals. PLoS Genetics 2014, 10, e1004685.