Area under the curve quantification outperforms spectral counting in metaproteomics, but matching between runs is detrimental
Awan, A.; Blakeley-Ruiz, A.; Kleiner, M.; Hinzke, T.
Metaproteomics enables the functional characterization of microbiomes and host-microbe interactions by detecting and quantifying thousands of proteins. In data-dependent acquisition metaproteomics, protein quantification is commonly performed using either MS1-based area under the curve (AUC) or MS2-based peptide spectral counts (SpC). In AUC quantification, match between runs (MBR) is frequently employed to minimize data sparsity, yet its impact on metaproteomic data remains unclear. Understanding MBR's impact on metaproteomic data is especially important for two reasons: the peak density in MS1 mass spectra is high, and not only individual proteins but even entire organisms may be present in one sample and absent from another. Both factors complicate accurate feature mapping and transfer. While accurate quantification is essential for deriving meaningful biological inferences from metaproteomic analyses, systematic evaluations of AUC and SpC quantification in metaproteomics remain scarce. In this study, we used defined complex metaproteomic samples to perform a ground truth-based evaluation of AUC and SpC quantification and to determine the impact of MBR on AUC quantification. We found that MBR led to a substantial number of falsely identified proteins in complex samples: protein identifications from an organism not present in a sample were wrongly transferred from other samples when MBR was used. We also found that MBR-free AUC data had a wider dynamic range and higher quantitative accuracy, and allowed more sensitive detection of abundance differences.
Significance of the Study
Although metaproteomics is increasingly used to advance microbiome research, quantification strategies are mostly selected based on convention rather than evidence, because ground truth-based evaluations of these strategies are lacking. Accurate protein quantification is key to deriving meaningful biological inferences from metaproteomic samples, yet it remains challenging due to their high complexity and uneven protein abundances. Here, we used defined metaproteomic samples to evaluate widely used quantification strategies and to determine the effects of match between runs (MBR) on quantitative accuracy. Based on our findings, MBR adds falsely identified proteins to metaproteomic data. While MBR-free AUC offers a broader dynamic range and higher quantitative accuracy, SpC offers better proteome coverage. With this study, we provide an evidence-based framework for the informed selection of quantification strategies in metaproteomics, and highlight the strengths and limitations of these approaches with respect to proteome coverage, dynamic range, quantitative accuracy, and error propagation. Our findings also have important implications for the biological interpretation of data derived from these strategies and lay the groundwork for future studies validating quantitative approaches in data-independent acquisition workflows.