Analyzing postprandial metabolomics data using multiway models: A simulation study

Li, L.; Yan, S.; Bakker, B. M.; Hoefsloot, H.; Chawes, B.; Horner, D.; Rasmussen, M. A.; Smilde, A. K.; Acar, E.

2022-12-20 systems biology
10.1101/2022.12.19.521154 bioRxiv

Background: Analysis of time-resolved postprandial metabolomics data can improve the understanding of metabolic mechanisms, potentially revealing biomarkers for early diagnosis of metabolic diseases and advancing precision nutrition and medicine. Postprandial metabolomics measurements at several time points from multiple subjects can be arranged as a subjects by metabolites by time points array. Traditional analysis methods are limited in terms of revealing subject groups, related metabolites, and temporal patterns simultaneously from such three-way data.

Results: We introduce an unsupervised multiway analysis approach based on the CANDECOMP/PARAFAC (CP) model for improved analysis of postprandial metabolomics data guided by a simulation study. Because of the lack of ground truth in real data, we generate simulated data using a comprehensive human metabolic model. This allows us to assess the performance of CP models in terms of revealing subject groups and underlying metabolic processes. We study three analysis approaches: analysis of fasting-state data using Principal Component Analysis, T0-corrected data (i.e., data corrected by subtracting fasting-state data) using a CP model and full-dynamic (i.e., full postprandial) data using CP. Through extensive simulations, we demonstrate that CP models capture meaningful and stable patterns from simulated meal challenge data, revealing underlying mechanisms and differences between diseased vs. healthy groups.

Conclusions: Our experiments show that it is crucial to analyze both fasting-state and T0-corrected data for understanding metabolic differences among subject groups. Depending on the nature of the subject group structure, the best group separation may be achieved by CP models of T0-corrected or full-dynamic data. This study introduces an improved analysis approach for postprandial metabolomics data while also shedding light on the debate about correcting baseline values in longitudinal data analysis.
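As an illustration of the data arrangement and T0 correction described in the abstract, the following is a minimal NumPy sketch, not the authors' implementation. The synthetic tensor, the array sizes, the helper names (`khatri_rao`, `cp_als`, `rel_error`), the plain alternating-least-squares fit, and the choice to drop the all-zero fasting slice after correction are all assumptions made for the example; the study itself simulates data with a human metabolic model.

```python
import numpy as np


def khatri_rao(P, Q):
    """Column-wise Kronecker product, shape (P.shape[0] * Q.shape[0], R)."""
    return np.einsum("ir,jr->ijr", P, Q).reshape(-1, P.shape[1])


def cp_als(X, rank, n_iter=200, seed=0):
    """Fit a rank-`rank` CP model to a three-way array with plain ALS (illustrative)."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A, B, C = (rng.standard_normal((n, rank)) for n in (I, J, K))
    # Unfold the array along each mode; the column order matches khatri_rao above.
    X1 = X.reshape(I, J * K)
    X2 = np.moveaxis(X, 1, 0).reshape(J, I * K)
    X3 = np.moveaxis(X, 2, 0).reshape(K, I * J)
    for _ in range(n_iter):
        # Each update is the least-squares solution for one factor with the others fixed.
        A = X1 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = X2 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X3 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C  # subject, metabolite, and time factors


def rel_error(X, A, B, C):
    """Relative Frobenius-norm error of the CP reconstruction."""
    X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X)


# Synthetic stand-in data (assumption): 20 subjects x 5 metabolites x 8 time points,
# where time index 0 is the fasting state.
rng = np.random.default_rng(1)
n_subjects, n_metabolites, n_times = 20, 5, 8
t = np.linspace(0.0, 1.0, n_times)
time_profiles = np.stack([t * np.exp(-3 * t), t * np.exp(-t)], axis=1)  # zero at t = 0
response = np.einsum(
    "ir,jr,kr->ijk",
    rng.random((n_subjects, 2)),
    rng.random((n_metabolites, 2)),
    time_profiles,
)
baseline = rng.random((n_subjects, n_metabolites, 1))
X = baseline + response + 0.01 * rng.standard_normal(response.shape)

fasting = X[:, :, 0]                # fasting-state matrix (subjects x metabolites)
X_t0 = X[:, :, 1:] - X[:, :, [0]]   # T0-corrected data, fasting slice dropped

A, B, C = cp_als(X_t0, rank=2)
print("relative fit error:", rel_error(X_t0, A, B, C))
```

In this sketch the rows of `A` would be inspected for subject groups, `B` for related metabolites, and `C` for temporal patterns; fitting `X` instead of `X_t0` corresponds to the full-dynamic setting.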

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1 | Metabolites | 50 papers in training set | Top 0.1% | 19.8%
2 | Analytical Chemistry | 205 papers in training set | Top 0.3% | 9.3%
3 | Bioinformatics | 1061 papers in training set | Top 3% | 7.3%
4 | Metabolomics | 11 papers in training set | Top 0.1% | 6.9%
5 | BMC Bioinformatics | 383 papers in training set | Top 2% | 6.4%
6 | PLOS ONE | 4510 papers in training set | Top 35% | 4.0%
50% of probability mass above
7 | Scientific Reports | 3102 papers in training set | Top 33% | 3.7%
8 | Bioinformatics Advances | 184 papers in training set | Top 1% | 3.6%
9 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 2% | 3.6%
10 | PLOS Computational Biology | 1633 papers in training set | Top 9% | 3.6%
11 | Journal of Proteome Research | 215 papers in training set | Top 0.9% | 2.6%
12 | npj Systems Biology and Applications | 99 papers in training set | Top 1% | 1.7%
13 | Frontiers in Molecular Biosciences | 100 papers in training set | Top 2% | 1.7%
14 | npj Digital Medicine | 97 papers in training set | Top 2% | 1.7%
15 | Nature Communications | 4913 papers in training set | Top 53% | 1.5%
16 | mSystems | 361 papers in training set | Top 5% | 1.4%
17 | Journal of Biomedical Informatics | 45 papers in training set | Top 1% | 1.2%
18 | Molecular Omics | 21 papers in training set | Top 0.2% | 1.0%
19 | Analytica Chimica Acta | 17 papers in training set | Top 0.4% | 1.0%
20 | Briefings in Bioinformatics | 326 papers in training set | Top 6% | 0.9%
21 | Frontiers in Nutrition | 23 papers in training set | Top 1% | 0.8%
22 | IEEE Access | 31 papers in training set | Top 1% | 0.7%
23 | SLAS Technology | 11 papers in training set | Top 0.3% | 0.7%
24 | PROTEOMICS | 35 papers in training set | Top 1% | 0.5%
25 | Journal of Chemical Information and Modeling | 207 papers in training set | Top 4% | 0.5%
26 | Computers in Biology and Medicine | 120 papers in training set | Top 6% | 0.5%