Poster Presentation HUPO 2019 - 18th Human Proteome Organization World Congress

Analytical guidelines for co-fractionation mass spectrometry obtained through global profiling of gold standard Saccharomyces cerevisiae protein complexes (#765)

Chi Nam Ignatius Pang 1, Sara Ballouz 2, Daniel Weissberger 1, Loïc M. Thibaut 3, Joshua J. Hamey 1, Jesse Gillis 2, Marc R. Wilkins 1, Gene Hart-Smith 1
  1. School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales, Australia
  2. Stanley Center for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, The United States of America
  3. School of Mathematics and Statistics, The University of New South Wales, Sydney, NSW, Australia

Protein Correlation Profiling (PCP) enables many protein complexes to be identified in a single experiment. A typical PCP experiment involves fractionation of endogenous, untagged protein complexes by size or other physicochemical parameters, followed by LC-MS/MS and label-free quantification of each fraction. Proteins in the same intact complex co-elute and therefore have highly correlated abundances across fractions. Although PCP can be used to identify intact complexes, the best approaches for the collection and analysis of PCP data remain undefined. This study aims to gain insight into both by benchmarking PCP datasets against gold standard complexes in a well-characterized model organism (Saccharomyces cerevisiae). Our analysis of experimental and modelled PCP datasets suggests that using several fractionation methods and combining their results, for example with Fisher’s combined probability test, is more beneficial than collecting the same number of fractions with a single fractionation method. By benchmarking the effects of 17 correlation metrics on the identification of known complexes, we showed that some metrics (e.g. Spearman correlation) were more effective than others (e.g. mutual information). While PCP identified many complexes observed in traditional experiments (e.g. AP-MS and Y2H), it also identified putative novel complexes. To measure the overlap of the PCP datasets with orthogonal gene expression data, we ran EGAD (Ballouz et al. 2017) on an aggregate PCP and gene co-expression network. We found that adding gene co-expression to PCP data contributed mainly to the confident identification of known complexes (e.g. EGAD scores of 0.63 for PCP alone, 0.72 for co-expression alone, and 0.71 for PCP and co-expression combined). The similarity of these EGAD scores suggests that novel complexes within PCP data are rare, and that confirming these novel complexes may require orthogonal experimental validation, for example using cross-linking mass spectrometry.
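
Two of the techniques named above, Spearman correlation of elution profiles and Fisher’s combined probability test, can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. The abundance values, the function names, and the per-method p-values are hypothetical, and how a p-value would be derived for each fractionation method is an assumption left outside the sketch. Fisher’s test also assumes that the combined p-values are independent.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / (sx * sy)


def fisher_combined_pvalue(pvalues):
    """Combine k independent p-values with Fisher's method.

    The statistic -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null hypothesis; for even degrees of
    freedom the survival function has the closed form used below.
    """
    if not pvalues or any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(pvalues)
    half = -sum(math.log(p) for p in pvalues)  # statistic / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical label-free abundances of two proteins across six fractions.
protein_a = [0, 2, 10, 30, 8, 1]
protein_b = [1, 3, 12, 25, 9, 2]
rho = spearman(protein_a, protein_b)

# Hypothetical p-values for the same protein pair from two fractionation methods.
combined_p = fisher_combined_pvalue([0.05, 0.05])
```

In this sketch a protein pair that co-elutes under each of two fractionation methods receives a smaller combined p-value than either method gives alone, which is the intuition behind combining methods rather than adding fractions to one method.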

  1. Ballouz S, Weber M, Pavlidis P, Gillis J. (2017) EGAD: ultra-fast functional analysis of gene networks. Bioinformatics. 33(4):612-614.