TMT proteomic analysis is becoming increasingly popular, and several computational pipelines are now available. Novel data analysis methods, including data imputation and normalisation, have been proposed; the more challenging task is typically the analysis of experiments that require several runs. We have previously published a statistical analysis pipeline for multi-run TMT analysis, termed TMTPrePro, which uses the quantitation provided by ProteomeDiscoverer.
Here we present an internal evaluation based on a carefully designed spike-in experiment comprising three 10-plex TMT replicate runs, with yeast peptides at known ratios (2%, 5% and 10%) each spiked into a mouse brain tissue lysate. The spiked-in TMT samples were fractionated by offline high-pH (HpH) chromatography and analysed on a Q-Exactive-HFX mass spectrometer, with quantification at the MS2 level. The goal was to comprehensively evaluate all sources of variation, from the LC-MS/MS analysis through to the statistical data analysis pipeline, and to provide a useful benchmark dataset.
We validated our quantitative pipeline in two scenarios: one using sample ratios to a common reference in each run, and an alternative using cross-batch normalisation of peak areas. We observed that internal reference scaling (IRS) normalisation reduced the batch effect, giving good run-to-run concordance with coefficients of variation below 10%. The results confirmed well-known aspects such as ratio compression, here in the range of 28-60%. Furthermore, the benchmarking data enabled us to choose optimal cut-offs for differential expression, including assessing the need for multiple testing correction. In our hands, a 1.2-fold change cut-off and uncorrected p-values provided the best balance of true and false positives.
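As an illustration of the cross-batch scenario, the following is a minimal sketch of IRS-style normalisation followed by a run-to-run coefficient of variation. It is not the TMTPrePro implementation. It assumes that each run contains one or more pooled reference channels, that proteins are listed in the same order in every run, and that each protein is scaled so its reference intensity matches the geometric mean of the references across runs. The function names `irs_normalise` and `cv_percent` and the toy intensities are hypothetical.

```python
import math
from statistics import mean, stdev


def irs_normalise(runs, ref_channels):
    """Scale each run so per-protein reference intensities agree across runs.

    runs: list of runs; each run is a list of per-protein channel intensities.
    ref_channels: for each run, the channel indices of the pooled reference
                  (assumed layout, for illustration only).
    """
    normalised = [[] for _ in runs]
    for p in range(len(runs[0])):
        # Reference intensity of protein p in each run (mean of reference channels)
        refs = [mean(run[p][c] for c in chans)
                for run, chans in zip(runs, ref_channels)]
        # Target value: geometric mean of the reference across runs
        target = math.exp(mean(math.log(r) for r in refs))
        for i, run in enumerate(runs):
            factor = target / refs[i]
            normalised[i].append([x * factor for x in run[p]])
    return normalised


def cv_percent(values):
    """Coefficient of variation (sample standard deviation / mean), in percent."""
    return 100.0 * stdev(values) / mean(values)


# Toy data: two proteins, three channels; channel 0 is the assumed reference.
run_a = [[100.0, 110.0, 90.0], [200.0, 210.0, 190.0]]
batch = [2.0, 0.5]  # artificial per-protein batch effect in the second run
run_b = [[x * f for x in row] for row, f in zip(run_a, batch)]

normalised = irs_normalise([run_a, run_b], [[0], [0]])
cv_before = cv_percent([run_a[0][1], run_b[0][1]])
cv_after = cv_percent([normalised[0][0][1], normalised[1][0][1]])
```

In this toy case the batch effect is purely multiplicative per protein, so the scaling removes it entirely; real data would retain residual variation after scaling.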