
OAC-PCA: orthogonal adjustment of confounding effects in principal component analysis for metabolomics data mining

Kurata, M.; Yamamoto, H.; Tsugawa, H.

2026-05-25 bioinformatics
10.64898/2026.05.21.726783 bioRxiv

Principal component analysis (PCA) is widely used in mass spectrometry-based metabolomics for exploratory data mining. Statistical testing of loading values can extract metabolite features associated with score patterns, but this approach requires principal components (PCs) to remain orthogonal while loadings are defined as correlation coefficients between PC scores and variables. Adjustment for Confounding PCA (AC-PCA) was previously developed to explore biologically meaningful components from data matrices affected by biological and technical confounders. However, AC-PCA does not simultaneously ensure PC orthogonality and a correlation-coefficient definition of loadings, limiting the statistical interpretation of its loadings. Here, we reformulated AC-PCA as Orthogonal Adjustment for Confounding effects in PCA (OAC-PCA). In OAC-PCA, PCs remain orthogonal, and loadings retain this correlation-coefficient interpretation. These properties enable statistical testing of metabolite associations while accounting for confounding effects.
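The correlation-coefficient definition of loadings mentioned in the abstract can be illustrated with ordinary PCA. The following is a minimal sketch of standard PCA on autoscaled data only; it is not the OAC-PCA or AC-PCA algorithm, whose formulation is not given here. The function names `pca_correlation_loadings` and `loading_t_statistic` are hypothetical, chosen for illustration. For autoscaled data, the correlation between the score of component k and variable j equals the square root of the k-th eigenvalue times the j-th element of the k-th eigenvector, and a correlation r computed from n samples can be tested with the usual statistic t = r * sqrt((n - 2) / (1 - r^2)).

```python
import numpy as np


def pca_correlation_loadings(X):
    """Ordinary PCA on autoscaled data (illustrative, not OAC-PCA).

    Returns the score matrix (samples x components) and the loadings
    (variables x components), where each loading is the Pearson
    correlation between a PC score vector and a variable.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Autoscale: zero mean and unit sample variance per variable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Thin SVD: Z = U * diag(S) * Vt; rows of Vt are the eigenvectors.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U * S                     # mutually orthogonal score vectors
    eigvals = S ** 2 / (n - 1)         # variance of each score vector
    # For autoscaled data: corr(score_k, variable_j) = sqrt(eigval_k) * v_jk.
    loadings = Vt.T * np.sqrt(eigvals)
    return scores, loadings


def loading_t_statistic(r, n):
    """t statistic for a correlation r from n samples (n - 2 degrees of freedom).

    Assumes |r| < 1; a loading of exactly +/-1 would divide by zero.
    """
    r = np.asarray(r, dtype=float)
    return r * np.sqrt((n - 2) / (1.0 - r ** 2))
```

Because the score vectors stay orthogonal and each loading is a plain correlation coefficient, the t statistic can be compared against a t distribution with n - 2 degrees of freedom to flag variables associated with a given score pattern. These are the two properties the abstract says a confounder-adjusted method must preserve.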

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Probability
1 | Bioinformatics | 1061 | Top 1% | 18.8%
2 | Genomics, Proteomics & Bioinformatics | 171 | Top 0.8% | 7.2%
3 | Metabolites | 50 | Top 0.1% | 6.9%
4 | PLOS ONE | 4510 | Top 24% | 6.9%
5 | Nature Communications | 4913 | Top 29% | 6.4%
6 | PLOS Computational Biology | 1633 | Top 6% | 6.4%
50% of probability mass above
7 | BMC Bioinformatics | 383 | Top 2% | 4.3%
8 | Analytical Chemistry | 205 | Top 0.8% | 3.7%
9 | Molecular & Cellular Proteomics | 158 | Top 0.7% | 3.6%
10 | Journal of Proteome Research | 215 | Top 0.8% | 3.6%
11 | Briefings in Bioinformatics | 326 | Top 2% | 3.1%
12 | Scientific Reports | 3102 | Top 52% | 1.9%
13 | Frontiers in Molecular Biosciences | 100 | Top 1% | 1.8%
14 | Advanced Science | 249 | Top 11% | 1.7%
15 | Cell Systems | 167 | Top 7% | 1.7%
16 | Bioinformatics Advances | 184 | Top 3% | 1.5%
17 | Communications Biology | 886 | Top 12% | 1.3%
18 | NAR Genomics and Bioinformatics | 214 | Top 3% | 1.2%
19 | Computational and Structural Biotechnology Journal | 216 | Top 7% | 1.1%
20 | Genome Biology | 555 | Top 6% | 0.9%
21 | iScience | 1063 | Top 26% | 0.9%
22 | npj Systems Biology and Applications | 99 | Top 2% | 0.8%
23 | BMC Genomics | 328 | Top 5% | 0.8%
24 | eLife | 5422 | Top 58% | 0.8%
25 | Cell Reports Methods | 141 | Top 6% | 0.6%
26 | Communications Chemistry | 39 | Top 2% | 0.5%
27 | Nature Machine Intelligence | 61 | Top 4% | 0.5%
28 | mSystems | 361 | Top 8% | 0.5%
29 | Molecular Systems Biology | 142 | Top 3% | 0.5%
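
The statement that the top 6 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed probabilities. This is a small sketch using the rounded percentages shown for the first ten journals; the helper name `journals_to_reach` is hypothetical, and the result is approximate because the page reports rounded values.

```python
from itertools import accumulate

# Rounded probabilities (%) for ranks 1-10 as listed on the page.
top_probabilities = [18.8, 7.2, 6.9, 6.9, 6.4, 6.4, 4.3, 3.7, 3.6, 3.6]


def journals_to_reach(probabilities, threshold):
    """Return (rank, cumulative %) at which the running sum first reaches threshold.

    Returns None if the threshold is never reached.
    """
    for rank, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return rank, total
    return None


rank, total = journals_to_reach(top_probabilities, 50.0)
print(f"{rank} journals reach {total:.1f}% of the probability mass")
```

With these rounded values the running sum is about 46.2% after five journals and about 52.6% after six, which agrees with where the page places the 50% marker.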