
Area under the curve quantification outperforms spectral counting in metaproteomics, but matching between runs is detrimental

Awan, A.; Blakeley-Ruiz, A.; Kleiner, M.; Hinzke, T.

2026-04-06 molecular biology
10.64898/2026.04.05.716595 bioRxiv

Metaproteomics enables the functional characterization of microbiomes and host-microbe interactions by detecting and quantifying thousands of proteins. In data-dependent acquisition metaproteomics, protein quantification is commonly performed using either MS1-based area under the curve (AUC) or MS2-based peptide spectral counts (SpC). In AUC quantification, match between runs (MBR) is frequently employed to minimize data sparsity, yet its impact on metaproteomic data remains unclear. Understanding MBR's impact on metaproteomic data is especially important for two reasons: the high peak density in the MS1 mass spectra, and the possibility that not only proteins but even entire organisms are present in one sample and absent in another, which would complicate accurate feature mapping and transfer. While accurate quantification is essential for deriving meaningful biological inferences from metaproteomic analyses, systematic evaluations of AUC and SpC quantification in metaproteomics remain scarce. In this study, we used defined complex metaproteomic samples to perform a ground truth-based evaluation of AUC and SpC quantification and to determine the impact of MBR on AUC quantification. We found that MBR led to a substantial number of falsely identified proteins in complex samples. Protein identifications from an organism not present in the sample were wrongly transferred from other samples when MBR was used. We found that MBR-free AUC data had a wider dynamic range, higher quantitative accuracy, and more sensitive detection of abundance differences.

Significance of the Study

Although metaproteomics is increasingly used to advance microbiome research, quantification strategies in metaproteomics are mostly selected based on convention rather than evidence, due to a lack of ground truth-based evaluations of these strategies.
Accurate protein quantification is key to deriving meaningful biological inferences from metaproteomic samples, yet it remains challenging due to their high complexity and uneven protein abundances. Here, we used defined metaproteomic samples to evaluate widely used quantification strategies in metaproteomics and to determine the effects of match between runs (MBR) on quantitative accuracy. Based on our findings, MBR adds falsely identified proteins to metaproteomic data. While MBR-free AUC offers a broader dynamic range and higher quantitative accuracy, SpC offers better proteome coverage. With this study, we provide an evidence-based framework for the informed selection of quantification strategies in metaproteomics, and highlight the strengths and limitations of these approaches with respect to proteome coverage, dynamic range, quantitative accuracy, and error propagation. Our findings also have important implications for the biological interpretation of data derived from these strategies and lay the groundwork for future studies validating quantitative approaches in data-independent acquisition workflows.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1 | mSystems | 361 papers in training set | Top 0.1% | 35.7%
2 | Journal of Proteome Research | 215 papers in training set | Top 0.5% | 5.1%
3 | PLOS Computational Biology | 1633 papers in training set | Top 9% | 3.8%
4 | mSphere | 281 papers in training set | Top 1% | 3.7%
5 | PLOS ONE | 4510 papers in training set | Top 37% | 3.7%
50% of probability mass above
6 | mBio | 750 papers in training set | Top 6% | 2.5%
7 | Microbiology Spectrum | 435 papers in training set | Top 1% | 2.5%
8 | ISME Communications | 103 papers in training set | Top 0.8% | 2.2%
9 | Frontiers in Microbiology | 375 papers in training set | Top 4% | 2.2%
10 | Nature Communications | 4913 papers in training set | Top 46% | 2.2%
11 | The ISME Journal | 194 papers in training set | Top 1% | 1.9%
12 | Microbiome | 139 papers in training set | Top 2% | 1.8%
13 | PeerJ | 261 papers in training set | Top 6% | 1.8%
14 | BMC Genomics | 328 papers in training set | Top 2% | 1.7%
15 | eLife | 5422 papers in training set | Top 46% | 1.4%
16 | Molecular & Cellular Proteomics | 158 papers in training set | Top 1% | 1.4%
17 | Scientific Reports | 3102 papers in training set | Top 63% | 1.4%
18 | Applied and Environmental Microbiology | 301 papers in training set | Top 2% | 1.3%
19 | PROTEOMICS | 35 papers in training set | Top 0.5% | 1.2%
20 | Frontiers in Marine Science | 55 papers in training set | Top 0.9% | 1.0%
21 | Briefings in Bioinformatics | 326 papers in training set | Top 6% | 0.9%
22 | Environmental Microbiome | 26 papers in training set | Top 0.4% | 0.9%
23 | Gut Microbes | 70 papers in training set | Top 0.9% | 0.8%
24 | Journal of Proteomics | 27 papers in training set | Top 0.4% | 0.8%
25 | Microorganisms | 101 papers in training set | Top 2% | 0.8%
26 | iScience | 1063 papers in training set | Top 30% | 0.8%
27 | Molecular Ecology Resources | 161 papers in training set | Top 1% | 0.8%
28 | Bioinformatics | 1061 papers in training set | Top 9% | 0.8%
29 | Methods in Ecology and Evolution | 160 papers in training set | Top 2% | 0.7%
30 | Nature Microbiology | 133 papers in training set | Top 4% | 0.7%