
The CcpNmr Analysis Simulated Metabolomics Database (CASMDB): An Open-Source Collection of Metabolite Annotation Data for 1D 1H NMR-Based Metabolomics

Hayward, M. W.; Mureddu, L. G.; Thompson, G.; Phelan, M.; Brooksbank, E. J.; Vuister, G. W.

2024-05-05 biochemistry
10.1101/2024.05.05.592402 bioRxiv

Databases are invaluable for the identification of individual metabolites in untargeted metabolomics analyses, providing annotated pure metabolite references that allow for comparisons with experimentally collected mixture samples. Despite the value of an extensive reference database, publicly available databases for NMR-based metabolomics are often incomplete with respect to experimental conditions and derived NMR annotation parameters, such as peak positions. Moreover, they are not designed for visualising the reference spectra alongside an experimental sample spectrum of interest, which limits their usefulness. As a consequence, researchers have resorted to their own user- or application-based database implementations. In this paper we describe the collection, remediation and integration of annotation data from the publicly available HMDB, BMRB and GISSMO NMR metabolomics databases to build the CcpNmr Analysis Simulated Metabolomics Database (CASMDB), which contains 1932 unique metabolite entries. This database, in concert with the AnalysisMetabolomics programme, also allows for accurate simulation of spectra at arbitrary field strengths. Together, these tools underpin the visualisation of experimental and simulated metabolite references and their use in 1D 1H NMR-based metabolomics studies.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1 | Metabolites | 50 papers in training set | Top 0.1% | 23.1%
2 | Analytical Chemistry | 205 papers in training set | Top 0.2% | 14.7%
3 | PLOS ONE | 4510 papers in training set | Top 21% | 8.6%
4 | Scientific Data | 174 papers in training set | Top 0.3% | 4.5%
50% of probability mass above
5 | Analytical Biochemistry | 26 papers in training set | Top 0.1% | 4.1%
6 | Nucleic Acids Research | 1128 papers in training set | Top 5% | 3.8%
7 | Nature Protocols | 30 papers in training set | Top 0.1% | 3.3%
8 | Nature Communications | 4913 papers in training set | Top 43% | 3.0%
9 | Scientific Reports | 3102 papers in training set | Top 44% | 2.7%
10 | PLOS Computational Biology | 1633 papers in training set | Top 14% | 1.9%
11 | SoftwareX | 15 papers in training set | Top 0.1% | 1.7%
12 | Journal of Visualized Experiments | 30 papers in training set | Top 0.3% | 1.7%
13 | Journal of Proteome Research | 215 papers in training set | Top 1% | 1.3%
14 | eLife | 5422 papers in training set | Top 48% | 1.3%
15 | Bioinformatics | 1061 papers in training set | Top 8% | 1.0%
16 | ACS Omega | 90 papers in training set | Top 3% | 0.8%
17 | Frontiers in Molecular Biosciences | 100 papers in training set | Top 4% | 0.8%
18 | Plant Direct | 81 papers in training set | Top 2% | 0.8%
19 | Journal of the American Society for Mass Spectrometry | 33 papers in training set | Top 0.5% | 0.8%
20 | Metabolomics | 11 papers in training set | Top 0.4% | 0.8%
21 | Current Protocols | 13 papers in training set | Top 0.2% | 0.7%
22 | Journal of Natural Products | 11 papers in training set | Top 0.4% | 0.7%
23 | MethodsX | 14 papers in training set | Top 0.6% | 0.7%
24 | Methods | 29 papers in training set | Top 0.7% | 0.7%
25 | Frontiers in Chemistry | 14 papers in training set | Top 0.4% | 0.7%
26 | PLOS Biology | 408 papers in training set | Top 22% | 0.7%
27 | Molecules | 37 papers in training set | Top 2% | 0.7%
28 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 12% | 0.5%
29 | NMR in Biomedicine | 24 papers in training set | Top 0.5% | 0.5%
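
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked with a running total over the ranked list. The sketch below is only an illustration: the function name `journals_to_reach` is hypothetical, and the probabilities are the first five values copied from the list above, not output from the prediction model itself.

```python
# Predicted probabilities (percent) for the five top-ranked journals,
# copied from the list above. Only these five are needed for the check.
probabilities = [
    ("Metabolites", 23.1),
    ("Analytical Chemistry", 14.7),
    ("PLOS ONE", 8.6),
    ("Scientific Data", 4.5),
    ("Analytical Biochemistry", 4.1),
]


def journals_to_reach(threshold, ranked):
    """Return (count, running_total) at the first rank where the
    running total of probabilities reaches the threshold, or None
    if the listed journals never reach it."""
    running_total = 0.0
    for count, (_, probability) in enumerate(ranked, start=1):
        running_total += probability
        if running_total >= threshold:
            return count, running_total
    return None


print(journals_to_reach(50.0, probabilities))
```

With these values the running total first passes 50% at the fourth journal (23.1 + 14.7 + 8.6 + 4.5), which matches the position of the "50% of probability mass above" marker in the list.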