
Expanding Glycopeptide Identification with Match-Between-Glycans in FragPipe

Shen, J.; Polasky, D. A.; Jager, S.; Yu, F.; Heck, A. J. R.; Reiding, K. R.; Nesvizhskii, A. I.

2026-02-19 bioinformatics
10.64898/2026.02.18.706650 bioRxiv

Glycosylation is one of the most important, but also most complex, post-translational modifications of proteins, playing a pivotal role in various pathological processes. Mass spectrometry-based large-scale glycoproteomics analysis offers a powerful approach to explore the fundamental roles of glycosylation in both physiological and pathological contexts. Traditionally, glycopeptide assignment in data-dependent acquisition (DDA) relies on information-dense MS2 spectra containing sufficient fragmentation information to identify both the peptide and glycan moieties. Achieving this fragmentation can be difficult, especially for low-abundance glycopeptides and/or large, complex glycans. These glycopeptides are often not assigned by current data analysis software, yet they can be of biological relevance. Here, we introduce a method called match-between-glycans (MBG), which expands glycopeptide identification while maintaining the existing glycoproteome analysis workflow. MBG expands the set of identified glycopeptides to include those without MS2 spectra, or with lower-quality MS2 spectra, by looking for MS1 signals displaced from other identified glycopeptides by the mass of one or more monosaccharide units. MBG can also identify glycans not included in the glycan database, such as those containing adducts or modifications, allowing these glycans to be recovered without a drastic expansion of the search space. Combined with target-decoy FDR control, we show that this method accurately expands glycopeptide identifications and provides a more complete quantitative profile of glycosylation at each glycosite. MBG is fully integrated into the glycoproteomics workflows in FragPipe, allowing seamless, one-click operation.
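The core idea in the abstract, looking for MS1 signals offset from an identified glycopeptide by a monosaccharide mass, can be illustrated with a minimal sketch. This is not FragPipe's implementation: the function names, the 10 ppm tolerance, the restriction to a single monosaccharide gain or loss at a fixed charge state, and the input feature list are all assumptions made for illustration. A real workflow would presumably also consider retention time, isotope patterns, and target-decoy FDR control, none of which are modeled here. The masses are standard monoisotopic residue masses.

```python
# Illustrative sketch only; names and parameters are hypothetical, not FragPipe's API.

PROTON = 1.007276  # proton mass in Da

# Monoisotopic residue masses (Da) of common monosaccharides.
MONOSACCHARIDES = {
    "Hex": 162.05282,
    "HexNAc": 203.07937,
    "Fuc": 146.05791,
    "NeuAc": 291.09542,
}


def candidate_mzs(neutral_mass, charge):
    """Return (label, m/z) pairs for gain or loss of one monosaccharide."""
    candidates = []
    for name, delta in MONOSACCHARIDES.items():
        for sign, prefix in ((1, "+"), (-1, "-")):
            shifted_mass = neutral_mass + sign * delta
            mz = (shifted_mass + charge * PROTON) / charge
            candidates.append((prefix + name, mz))
    return candidates


def match_features(neutral_mass, charge, feature_mzs, tol_ppm=10.0):
    """Find observed MS1 feature m/z values within tol_ppm of a candidate."""
    hits = []
    for label, mz in candidate_mzs(neutral_mass, charge):
        for observed in feature_mzs:
            if abs(observed - mz) / mz * 1e6 <= tol_ppm:
                hits.append((label, observed))
    return hits


# Hypothetical identified glycopeptide (3000.0 Da neutral mass, charge 3)
# and two hypothetical MS1 feature m/z values.
hits = match_features(3000.0, 3, [1055.0249, 1200.0])
print(hits)
```

In this toy input, the first feature lies one hexose above the identified glycopeptide at charge 3, so it is reported as a "+Hex" candidate; the second feature matches no single-monosaccharide offset.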

Matching journals

The top 4 journals together cross 50% of the predicted probability mass (their listed values sum to 58.1%).

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Bioinformatics | 1061 | Top 2% | 17.7% |
| 2 | Molecular & Cellular Proteomics | 158 | Top 0.1% | 14.9% |
| 3 | Journal of Proteome Research | 215 | Top 0.2% | 14.9% |
| 4 | Analytical Chemistry | 205 | Top 0.2% | 10.6% |
| 5 | Nature Communications | 4913 | Top 32% | 4.9% |
| 6 | Journal of the American Society for Mass Spectrometry | 33 | Top 0.1% | 4.0% |
| 7 | PLOS ONE | 4510 | Top 42% | 3.1% |
| 8 | PROTEOMICS | 35 | Top 0.3% | 2.1% |
| 9 | Computational and Structural Biotechnology Journal | 216 | Top 4% | 1.8% |
| 10 | Briefings in Bioinformatics | 326 | Top 4% | 1.7% |
| 11 | Cell Reports Methods | 141 | Top 3% | 1.3% |
| 12 | Communications Biology | 886 | Top 12% | 1.3% |
| 13 | Genomics, Proteomics & Bioinformatics | 171 | Top 4% | 1.2% |
| 14 | Nature Methods | 336 | Top 5% | 1.2% |
| 15 | Advanced Science | 249 | Top 15% | 1.0% |
| 16 | iScience | 1063 | Top 24% | 1.0% |
| 17 | Nature Machine Intelligence | 61 | Top 3% | 0.9% |
| 18 | PLOS Computational Biology | 1633 | Top 22% | 0.9% |
| 19 | Metabolites | 50 | Top 1% | 0.8% |
| 20 | Frontiers in Plant Science | 240 | Top 5% | 0.8% |
| 21 | Scientific Reports | 3102 | Top 78% | 0.7% |
| 22 | Cell Systems | 167 | Top 13% | 0.7% |
| 23 | Genome Biology | 555 | Top 9% | 0.5% |
| 24 | Bioinformatics Advances | 184 | Top 6% | 0.5% |