Prioritizing peptides for targeted mass spectrometry experiments using deep learning

Sonthalia, S.; Dasgupta, P.; Hsu, C.; Wen, B.; MacCoss, M. J.; Noble, W. S.

2026-05-26 bioinformatics
10.64898/2026.05.21.727053 bioRxiv
One critical step in any targeted mass spectrometry experiment is selecting, from each protein of interest, a small number of peptides that respond well in the mass spectrometer and can serve as reliable proxies for protein quantification. Existing methods select target peptides either by relying on prior empirical measurements, which limits their applicability to previously observed peptides, or by using machine learning to predict peptide behavior from sequence alone. However, current machine learning tools suffer from various limitations, including using detectability as an indirect proxy for intensity, relying on small training sets, or ignoring the precursor charge state. In this study, we introduce Bromo, a transformer-based deep learning model that ranks peptide precursors from a given protein by their relative response, taking charge state into account. Trained on millions of annotated peptide pairs derived from large-scale, publicly available data-independent acquisition mass spectrometry data, Bromo consistently outperforms existing sequence-based methods across diverse, independent datasets. Furthermore, we show that fine-tuning Bromo on experiment-specific data can account for differences in sample preparation, sample matrix, and instrument platform, all of which influence which peptides serve as optimal targets. This adaptability makes Bromo a practical tool for selecting target peptides for selected reaction monitoring and parallel reaction monitoring assay development across a wide range of experimental conditions.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Analytical Chemistry | 205 papers in training set | Top 0.2% | 14.4%
2. Bioinformatics | 1061 papers in training set | Top 2% | 14.4%
3. Journal of Proteome Research | 215 papers in training set | Top 0.2% | 14.4%
4. PLOS ONE | 4510 papers in training set | Top 24% | 6.9%
50% of probability mass above
5. Nature Communications | 4913 papers in training set | Top 29% | 6.4%
6. Journal of the American Society for Mass Spectrometry | 33 papers in training set | Top 0.1% | 4.9%
7. Molecular & Cellular Proteomics | 158 papers in training set | Top 0.5% | 4.9%
8. Nature Machine Intelligence | 61 papers in training set | Top 0.9% | 3.7%
9. PROTEOMICS | 35 papers in training set | Top 0.3% | 1.9%
10. Communications Biology | 886 papers in training set | Top 14% | 1.2%
11. Nature Methods | 336 papers in training set | Top 5% | 1.2%
12. Cell Systems | 167 papers in training set | Top 10% | 1.0%
13. Advanced Science | 249 papers in training set | Top 17% | 0.9%
14. Scientific Reports | 3102 papers in training set | Top 70% | 0.9%
15. Cell Reports Methods | 141 papers in training set | Top 4% | 0.9%
16. Communications Chemistry | 39 papers in training set | Top 0.9% | 0.8%
17. ACS Nano | 99 papers in training set | Top 4% | 0.8%
18. Biophysical Journal | 545 papers in training set | Top 5% | 0.7%
19. Metabolites | 50 papers in training set | Top 1% | 0.7%
20. Nature Biotechnology | 147 papers in training set | Top 9% | 0.6%
21. BMC Bioinformatics | 383 papers in training set | Top 8% | 0.6%
22. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 11% | 0.6%
23. mAbs | 28 papers in training set | Top 0.4% | 0.6%
24. Analytical and Bioanalytical Chemistry | 17 papers in training set | Top 0.5% | 0.6%
25. Briefings in Bioinformatics | 326 papers in training set | Top 7% | 0.6%
26. PLOS Computational Biology | 1633 papers in training set | Top 27% | 0.6%
27. Nature Chemical Biology | 104 papers in training set | Top 5% | 0.5%
28. Nano Letters | 63 papers in training set | Top 4% | 0.5%
29. Analytica Chimica Acta | 17 papers in training set | Top 0.5% | 0.8%
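
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed probabilities. The following is a minimal sketch, assuming the displayed percentages are rounded values and using only the first six entries from the list; the variable names are illustrative and not part of the page.

```python
from itertools import accumulate

# Predicted probabilities (in percent) for the first six listed journals,
# copied from the list above. Displayed values are assumed to be rounded.
probs = [14.4, 14.4, 14.4, 6.9, 6.4, 4.9]

# Running total of probability mass by rank.
cumulative = list(accumulate(probs))

# Smallest number of top-ranked journals whose combined mass reaches 50%.
n_needed = next(rank for rank, total in enumerate(cumulative, start=1)
                if total >= 50.0)

print(n_needed)
```

Under these assumptions the first three journals sum to about 43.2%, and adding the fourth brings the total to about 50.1%, which matches the position of the "50% of probability mass above" marker.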