Background
Despite the direct and indirect efforts of the proteomic community, the fraction of blind spots on the protein map remains significant: almost 14% of human master proteins still have no experimental validation. Proteomics appears to have reached the stage where all the easy targets have been captured, and each further protein identification requires more deliberate effort and curiosity in exploring unusual types of biomaterial and/or conditions.
Methodologies
We brought together diverse mass spectrometry data obtained by members of the Russian proteomic community from more than 25 types of non-trivial biological samples and cell lines. These data were processed in a uniform manner with three search engines (X!Tandem, MS-GF+, OMSSA) that are part of the SearchGUI package. We complemented the MS data with the results of RNA-Seq, neXtProt and GPMdb analyses to estimate the probability of detecting a unique peptide and hence the possibility of identifying the corresponding protein.
Findings
The study detected 7 missing proteins identified by two peptides and a further 149 missing proteins identified by a single proteotypic peptide. We analyzed gene expression levels to suggest feasible targets for further validation of missing and uncertain protein observations, so that these observations can fully meet the requirements of the international consortium.
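The split between two-peptide and single-peptide detections reported above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the input layout (pairs of protein accession and peptide sequence), and the toy data are hypothetical, and the sketch assumes that the peptides have already been filtered to proteotypic ones and that "two peptides" means at least two distinct sequences.

```python
from collections import defaultdict


def tier_missing_proteins(psms, missing):
    """Group missing proteins by their number of distinct peptides.

    psms    -- iterable of (protein_accession, peptide_sequence) pairs,
               assumed to be already restricted to proteotypic peptides
    missing -- set of accessions currently regarded as missing proteins
    """
    peptides = defaultdict(set)
    for protein, peptide in psms:
        if protein in missing:
            # A set keeps each peptide sequence only once per protein.
            peptides[protein].add(peptide)

    tiers = {"two_or_more": [], "single": []}
    for protein, peps in sorted(peptides.items()):
        # Assumption: two or more distinct peptides form the stronger tier.
        key = "two_or_more" if len(peps) >= 2 else "single"
        tiers[key].append(protein)
    return tiers


# Hypothetical toy input: P1 and P2 are "missing", P3 is already validated.
psms = [
    ("P1", "AAAK"),
    ("P1", "CCCR"),
    ("P2", "DDDK"),
    ("P2", "DDDK"),  # repeated match of the same peptide counts once
    ("P3", "EEEK"),
]
missing = {"P1", "P2"}
tiers = tier_missing_proteins(psms, missing)
```

Counting distinct sequences rather than spectrum matches keeps a peptide that was observed many times from inflating the evidence for its protein.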
Conclusion
Proteins are on unequal terms from the very beginning: some of them have a competitive advantage in being captured. Each case of protein identification is unique, that of a missing protein especially so, and requires a custom-tailored approach and criteria of reliability. We believe that non-standard methodological and bioinformatic solutions applied to unusual biomaterials will be fruitful sources of the unknown pieces of the proteomic puzzle. Given the creative nature of this task, we invite the international community to reconsider the specific quality requirements for proteomic data on “missing” and uncertain proteins.