Data-independent acquisition (DIA) is a powerful technique for deep, proteome-wide profiling. DIA methodologies often rely on sample-specific spectrum libraries built from data-dependent acquisition (DDA) experiments. This approach produces high-performance libraries with instrument-specific fragmentation and retention times, but at the expense of time, sample, and significant offline fractionation effort. Previously, we demonstrated a DIA-only workflow that built chromatogram libraries by searching gas-phase fractionated (GPF) DIA runs with PECAN, a FASTA search engine. However, we found that success varied because the search lacked fragmentation and elution information and had to cover a substantially larger search space. Here we leverage fragmentation prediction by Prosit to generate chromatogram libraries: we search six GPF-DIA runs with a Prosit-predicted library and then replace the predicted retention times and fragmentation with the sample- and instrument-specific empirical values found in those runs.
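The library-correction step described above can be illustrated with a minimal sketch. It is not the workflow's actual implementation. The record types, field names, the `build_chromatogram_library` function, and the 1% q-value cutoff are all hypothetical and chosen only for illustration. The sketch assumes that each confident GPF-DIA detection carries an observed retention time and observed fragment intensities, and that peptides without a confident detection are dropped from the filtered library.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class LibraryEntry:
    """One library record (hypothetical layout, not a real file format)."""
    peptide: str                 # peptide sequence plus charge, e.g. "AAK/2"
    retention_time: float        # minutes
    fragments: Dict[str, float]  # fragment ion label -> relative intensity
    empirical: bool = False      # True once values come from GPF-DIA data


@dataclass(frozen=True)
class Detection:
    """One peptide detection from a GPF-DIA search (hypothetical layout)."""
    peptide: str
    q_value: float
    retention_time: float
    fragments: Dict[str, float]


def build_chromatogram_library(
    predicted: Iterable[LibraryEntry],
    detections: Iterable[Detection],
    max_q_value: float = 0.01,   # assumed cutoff, not taken from the text
) -> List[LibraryEntry]:
    # Keep the most confident detection per peptide across all GPF-DIA runs.
    best: Dict[str, Detection] = {}
    for det in detections:
        if det.q_value > max_q_value:
            continue
        current = best.get(det.peptide)
        if current is None or det.q_value < current.q_value:
            best[det.peptide] = det

    library: List[LibraryEntry] = []
    for entry in predicted:
        det = best.get(entry.peptide)
        if det is None:
            continue  # not confidently detected: left out of the filtered library
        # Swap predicted values for the empirical ones observed in GPF-DIA.
        library.append(
            replace(
                entry,
                retention_time=det.retention_time,
                fragments=dict(det.fragments),
                empirical=True,
            )
        )
    return library
```

In this sketch the predicted library acts only as a search template: every entry that survives filtering ends up with retention time and fragment intensities measured on the same instrument and in the same sample matrix as the later wide-window DIA runs.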
We benchmarked our workflow in yeast with ten Prosit-predicted libraries generated at various normalized collision energy (NCE) settings. While spectrum prediction accuracy is highly dependent on NCE and has a significant effect when matching to single-injection DIA, GPF-DIA is less sensitive to spectrum library quality and, across a wide range of NCE settings, can produce filtered libraries equal in size to a sample-specific DDA library built from 10 high-pH reverse-phase fractions. We find that the filtered chromatogram libraries have more accurate fragmentation and retention times than the DDA library because GPF-DIA fragmentation patterns match wide-window DIA more closely than DDA patterns do, and because GPF does not affect chromatographic interactions with the sample matrix.
We applied our workflow to gametocyte cultures of Plasmodium falciparum, the parasite responsible for 50% of all malaria cases. In wide-window DIA experiments, we detected parasite peptides at dilutions of up to 1:100 with uninfected red blood cells while maintaining higher quantification accuracy than comparable DDA experiments. In conclusion, our approach to library generation produces high-quality, sample-specific libraries without offline fractionation, using only six GPF-DIA runs.