Poster Presentation HUPO 2019 - 18th Human Proteome Organization World Congress

MSFragger: fast and sensitive peptide identification in diverse proteomic datasets (#943)

Fengchao Yu 1, Guo Ci Teo 1, Andy T Kong 1, Felipe V Leprevost 1, Dmitriy M Avtonomov 1, Hui-Yin Chang 1, Daniel Geiszler 1, Sarah E Haynes 1, Alexey I Nesvizhskii 1
  1. University of Michigan, Ann Arbor, MICHIGAN, United States

MSFragger is an MS/MS database search tool that uses a fragment ion indexing method to achieve very fast search speeds. We present comparisons showing that MSFragger is the fastest of five search engines that support open (mass-tolerant) searching (MODa, PIPI, MSFragger, pFind3, and TagGraph). In these comparisons it also has the highest sensitivity and the lowest error rate. Since its publication in 2017, MSFragger has been widely adopted in a variety of applications, and our group has continued to improve its robustness and versatility. Several of these advances significantly improve its performance: shifted ions searching, mass calibration, parameter optimization, split database search, and additional ion series (e.g. a, c, x, and z ions). Shifted ions searching takes the effects of unknown modifications into account in scoring, which increases sensitivity for modified peptides. The mass calibration feature calibrates both precursor and fragment masses, which allows narrower search tolerances and reduces the number of false positives (especially in open searches). The parameter optimization feature tunes parameters according to spectral properties and calibrated masses; because MSFragger runs quickly, it can try different parameter combinations and keep the one that gives the best search results. The split database search feature allows MSFragger to search very large databases, and to perform nonspecific searches (e.g. for HLA peptides), on computers with limited memory (RAM). We also demonstrate that MSFragger can analyze timsTOF PASEF data with higher sensitivity and significantly shorter run time than MaxQuant and PEAKS Studio X. Ongoing developments include support for reading raw files (for Thermo Fisher and Bruker instruments) and improved analysis of glycopeptides.
MSFragger can be run as a command-line tool, through the FragPipe GUI, or as a Proteome Discoverer node (https://msfragger.nesvilab.org/).
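
To illustrate the general idea behind fragment ion indexing mentioned in the abstract, the following is a minimal toy sketch in Python. It is not MSFragger's implementation: the function names, the 0.02 m/z bin width, the three example peptides, and the simple shared-peak count used as a score are all assumptions chosen for illustration. The sketch precomputes singly charged b and y ions for every candidate peptide, stores them in mass bins, and then scores a spectrum by looking up each peak in the index instead of comparing it against every peptide in turn.

```python
from collections import defaultdict

# Monoisotopic residue masses (Da) for the amino acids used in this toy example.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
    "E": 129.04259, "R": 156.10111,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_mzs(peptide):
    """Return singly charged b and y ion m/z values for an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    total = sum(masses)
    mzs = []
    prefix = 0.0
    for m in masses[:-1]:  # one b/y pair per backbone cleavage site
        prefix += m
        mzs.append(prefix + PROTON)                  # b ion
        mzs.append(total - prefix + WATER + PROTON)  # y ion
    return mzs


def build_index(peptides, bin_width=0.02):
    """Map each m/z bin to the set of peptide ids that have a fragment in it."""
    index = defaultdict(set)
    for pid, peptide in enumerate(peptides):
        for mz in fragment_mzs(peptide):
            index[int(mz / bin_width)].add(pid)
    return index


def score(index, peaks, n_peptides, bin_width=0.02):
    """Count, per peptide, how many observed peaks hit one of its fragments."""
    counts = [0] * n_peptides
    for mz in peaks:
        b = int(mz / bin_width)
        hits = set()
        for neighbor in (b - 1, b, b + 1):  # tolerate bin-edge effects
            hits |= index.get(neighbor, set())
        for pid in hits:
            counts[pid] += 1
    return counts


# Hypothetical candidate list and a noise-free "spectrum" made from one of them.
peptides = ["PEPTK", "GASVR", "LEAVK"]
index = build_index(peptides)
counts = score(index, fragment_mzs("GASVR"), len(peptides))
best = max(range(len(peptides)), key=counts.__getitem__)
print(peptides[best], counts)
```

Because no precursor mass filter is applied in the lookup, a peptide carrying an unknown modification would still collect matches from its unshifted fragments in a sketch like this, which is the intuition behind using such an index for open searches.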