Predicting and Elucidating Peptide Retention Mechanisms with Graph Attention Networks

Kensert, A.; Hruzova, K.; Devreese, R.; Nameni, A.; Declercq, A.; Gabriels, R.; Martens, L.; Bouwmeester, R.; Urban, J.

2026-05-20 bioinformatics
10.64898/2026.05.18.725893 bioRxiv

Liquid chromatography (LC) is a key technology in bottom-up proteomics, separating proteolytic peptides to decrease sample complexity, enhance coverage, and increase the robustness of protein identification and quantification. Although high-resolution mass spectrometry has advanced significantly, comparable progress in LC has lagged, primarily due to a limited understanding of peptide-column interactions. To bridge this knowledge gap, we introduce a novel deep learning model (PeptideGNN) based on a Graph Neural Network (GNN) architecture to model and elucidate peptide behaviors across various separation conditions. Trained to accurately predict peptide retention times on ten diverse proteomic datasets, the model subsequently employed a saliency mapping technique to interpret the underlying retention mechanisms. Our model consistently outperformed existing retention-time predictors across multiple datasets, while the saliency mapping, importantly, revealed insights into peptide-stationary phase interactions, highlighting the effects of neighboring amino acids, post-translational modifications (PTMs), chromatographic columns, and mobile phase additives on peptide retention.

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1 | Nature Communications | 4913 papers in training set | Top 14% | 12.5%
2 | Analytical Chemistry | 205 papers in training set | Top 0.2% | 10.6%
3 | Journal of Proteome Research | 215 papers in training set | Top 0.3% | 10.2%
4 | Molecular & Cellular Proteomics | 158 papers in training set | Top 0.2% | 10.2%
5 | Bioinformatics | 1061 papers in training set | Top 4% | 6.4%
6 | Advanced Science | 249 papers in training set | Top 4% | 4.4%
50% of probability mass above
7 | Nature Machine Intelligence | 61 papers in training set | Top 0.8% | 4.0%
8 | PROTEOMICS | 35 papers in training set | Top 0.2% | 3.6%
9 | PLOS ONE | 4510 papers in training set | Top 42% | 3.1%
10 | Communications Biology | 886 papers in training set | Top 3% | 3.1%
11 | Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 3% | 1.9%
12 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 4% | 1.7%
13 | Briefings in Bioinformatics | 326 papers in training set | Top 4% | 1.5%
14 | Communications Chemistry | 39 papers in training set | Top 0.5% | 1.2%
15 | Cell Systems | 167 papers in training set | Top 9% | 1.2%
16 | PLOS Computational Biology | 1633 papers in training set | Top 21% | 1.0%
17 | Genome Biology | 555 papers in training set | Top 6% | 0.9%
18 | ACS Nano | 99 papers in training set | Top 4% | 0.8%
19 | Nano Letters | 63 papers in training set | Top 3% | 0.7%
20 | Journal of the American Society for Mass Spectrometry | 33 papers in training set | Top 0.6% | 0.7%
21 | iScience | 1063 papers in training set | Top 36% | 0.7%
22 | Nature Chemical Biology | 104 papers in training set | Top 4% | 0.7%
23 | Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 47% | 0.7%
24 | JACS Au | 35 papers in training set | Top 1% | 0.5%
25 | Small Methods | 26 papers in training set | Top 2% | 0.5%
26 | Scientific Reports | 3102 papers in training set | Top 80% | 0.5%
27 | Science Advances | 1098 papers in training set | Top 35% | 0.5%
28 | mSystems | 361 papers in training set | Top 8% | 0.5%