Metabolite discovery through global annotation of untargeted metabolomics data

Chen, L.; Lu, W.; Wang, L.; Xing, X.; Teng, X.; Zeng, X.; Muscarella, A. D.; Shen, Y.; Cowan, A. J.; McReynolds, M. R.; Kennedy, B.; Lato, A. M.; Campagna, S. R.; Singh, M.; Rabinowitz, J. D.

2021-01-06 bioinformatics
10.1101/2021.01.06.425569 bioRxiv

Liquid chromatography-high resolution mass spectrometry (LC-MS)-based metabolomics aims to identify and quantitate all metabolites, but most LC-MS peaks remain unidentified. Here, we present a global network optimization approach, NetID, to annotate untargeted LC-MS metabolomics data. The approach aims to generate, for all experimentally observed ion peaks, annotations that match the measured masses, retention times, and (when available) MS/MS fragmentation patterns. Peaks are connected based on mass differences reflecting adduct formation, fragmentation, isotopes, or feasible biochemical transformations. Global optimization generates a single network linking most observed ion peaks, enhances peak assignment accuracy, and produces chemically informative peak-peak relationships, including for peaks lacking MS/MS spectra. Applying this approach to yeast and mouse data, we identified five novel metabolites (thiamine derivatives and N-glucosyl-taurine). Isotope tracer studies indicate active flux through these metabolites. Thus, NetID applies existing metabolomic knowledge and global optimization to annotate untargeted metabolomics data, revealing novel metabolites.
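
The peak-connection step described in the abstract can be illustrated with a minimal sketch. This is not the NetID implementation: the rule table, the function name `connect_peaks`, the tolerances, and the example peaks are all hypothetical, and the assumption that isotope and adduct edges require co-elution while biochemical edges do not is made here for illustration only. The global optimization step that selects among candidate annotations is not shown.

```python
from itertools import combinations

# Hypothetical rule table: (label, mass difference in Da, requires co-elution).
# A tiny hand-picked subset for illustration, not the rule set used by NetID.
RULES = [
    ("13C isotope", 1.003355, True),
    ("Na-H adduct exchange", 21.981944, True),
    ("methylation (CH2)", 14.015650, False),
]


def connect_peaks(peaks, ppm=10.0, rt_tol=0.1):
    """Return candidate edges (i, j, label) between peaks.

    peaks  : list of (m/z, retention time in minutes) tuples
    ppm    : mass tolerance in parts per million (assumed value)
    rt_tol : retention-time tolerance in minutes (assumed value)
    """
    edges = []
    for (i, (mz_i, rt_i)), (j, (mz_j, rt_j)) in combinations(enumerate(peaks), 2):
        delta = abs(mz_i - mz_j)
        # Absolute tolerance scaled from the heavier peak of the pair.
        tol = max(mz_i, mz_j) * ppm * 1e-6
        for label, diff, needs_coelution in RULES:
            if abs(delta - diff) > tol:
                continue
            # Isotopes and adducts of one compound should elute together.
            if needs_coelution and abs(rt_i - rt_j) > rt_tol:
                continue
            edges.append((i, j, label))
    return edges


# Made-up example peaks: a parent ion, its 13C isotopologue, a sodium adduct,
# and a later-eluting peak one CH2 heavier.
example_peaks = [
    (181.0707, 5.00),
    (182.0740, 5.01),
    (203.0526, 5.02),
    (195.0863, 6.20),
]

for edge in connect_peaks(example_peaks):
    print(edge)
```

Each returned edge is only a candidate relationship. In the approach described above, a subsequent global optimization decides which candidate annotations and edges to keep across the whole network.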

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Metabolites (50 papers in training set): Top 0.1%, 22.5%
2. Bioinformatics (1061 papers in training set): Top 2%, 12.5%
3. Analytical Chemistry (205 papers in training set): Top 0.3%, 10.1%
4. Nature Communications (4913 papers in training set): Top 25%, 7.2%
50% of probability mass above
5. PLOS Computational Biology (1633 papers in training set): Top 7%, 4.9%
6. Bioinformatics Advances (184 papers in training set): Top 1%, 3.6%
7. Journal of Proteome Research (215 papers in training set): Top 0.8%, 3.1%
8. Molecular & Cellular Proteomics (158 papers in training set): Top 0.7%, 3.1%
9. Computational and Structural Biotechnology Journal (216 papers in training set): Top 3%, 2.4%
10. BMC Bioinformatics (383 papers in training set): Top 4%, 1.9%
11. Briefings in Bioinformatics (326 papers in training set): Top 3%, 1.9%
12. PLOS ONE (4510 papers in training set): Top 50%, 1.9%
13. Cell Reports Methods (141 papers in training set): Top 2%, 1.9%
14. mSystems (361 papers in training set): Top 5%, 1.7%
15. Scientific Reports (3102 papers in training set): Top 64%, 1.3%
16. Microbiome (139 papers in training set): Top 2%, 1.3%
17. Nature Methods (336 papers in training set): Top 5%, 1.1%
18. Nature Biotechnology (147 papers in training set): Top 6%, 0.9%
19. NAR Genomics and Bioinformatics (214 papers in training set): Top 3%, 0.8%
20. npj Systems Biology and Applications (99 papers in training set): Top 2%, 0.8%
21. iScience (1063 papers in training set): Top 32%, 0.7%
22. Genomics, Proteomics & Bioinformatics (171 papers in training set): Top 6%, 0.7%
23. Scientific Data (174 papers in training set): Top 2%, 0.7%
24. Communications Biology (886 papers in training set): Top 26%, 0.7%
25. Metabolic Engineering (68 papers in training set): Top 0.7%, 0.7%
26. Metabolomics (11 papers in training set): Top 0.6%, 0.6%
27. Genome Biology (555 papers in training set): Top 8%, 0.6%