Trustworthy ML/AI for Aging Clocks: Preventing Systematic Prediction Bias in Biological Age Estimation

Lee, H.; Ye, Z.; Yang, Y.; Pan, Y.; Maron, B.; Wang, Z.; Kochunov, P.; Thompson, P.; Hong, L. E.; Ma, T.; Chen, C.; Chen, S.

2026-06-01 bioinformatics
10.64898/2026.05.27.728155 bioRxiv
Machine learning (ML)- and artificial intelligence (AI)-based aging clocks are increasingly used to quantify physiological and molecular aging, as distinct from chronological age, from omics and medical imaging data. Here, we characterize a fundamental but underappreciated computational limitation of commonly used ML/AI regression models: systematic prediction bias and its propagation to downstream association estimates. We demonstrate that this bias can distort, and in some cases reverse, biomedical conclusions drawn from aging-clock analyses. For example, it can produce spurious associations suggesting that older predicted brain age is linked to better cognitive performance, or that older epigenetic age is linked to better kidney function. To address this problem, we introduce a principled and broadly applicable ML/AI regression framework based on constrained optimization that aims to ensure trustworthy aging-clock estimation and biomedical inference.
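
The kind of bias the abstract describes can be illustrated with a small simulation. The sketch below is not the authors' framework and does not implement their constrained optimization; it is a minimal example under assumed settings (synthetic features, ordinary least squares, a hypothetical function name `simulate_age_gap_bias`). The simulated cognition score depends only on chronological age, so any association between the age gap (predicted minus chronological age) and cognition is produced by the regression itself.

```python
import numpy as np


def simulate_age_gap_bias(n=5000, n_features=20, noise_sd=3.0, seed=0):
    """Toy simulation of age-dependent bias in an ordinary least-squares aging clock.

    All settings are illustrative assumptions, not values from the preprint.
    """
    rng = np.random.default_rng(seed)
    age = rng.uniform(40, 80, size=n)

    # Synthetic "biomarkers": each is a noisy linear function of age.
    weights = rng.normal(size=n_features)
    signal = np.outer(age - age.mean(), weights) / 10.0
    features = signal + rng.normal(scale=noise_sd, size=(n, n_features))

    # Cognition declines with chronological age only; there is no true
    # effect of accelerated biological aging in this simulation.
    cognition = -0.05 * age + rng.normal(scale=1.0, size=n)

    # Fit the clock on the first half, evaluate on the second half.
    half = n // 2
    design_train = np.column_stack([np.ones(half), features[:half]])
    design_test = np.column_stack([np.ones(n - half), features[half:]])
    coef, *_ = np.linalg.lstsq(design_train, age[:half], rcond=None)

    predicted = design_test @ coef
    age_test = age[half:]
    gap = predicted - age_test  # "age gap" used in downstream analyses

    return {
        # Slope below 1 means predictions are pulled toward the mean age.
        "slope_pred_on_age": np.polyfit(age_test, predicted, 1)[0],
        "corr_gap_age": np.corrcoef(gap, age_test)[0, 1],
        "corr_gap_cognition": np.corrcoef(gap, cognition[half:])[0, 1],
    }


if __name__ == "__main__":
    for name, value in simulate_age_gap_bias().items():
        print(f"{name}: {value:.3f}")
```

In this setup the predictions are compressed toward the mean age, so the age gap is negatively correlated with chronological age. Because cognition also falls with age, the gap ends up positively correlated with cognition, which mirrors the spurious "older predicted brain age, better cognition" pattern mentioned in the abstract.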

Matching journals

The top 7 journals account for 50% of the predicted probability mass.
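
The "top 7" figure can be checked from the probabilities listed on this page. The snippet below is a small sketch with a hypothetical helper name (`journals_to_reach`); it uses the first ten listed percentages, which are rounded to one decimal place.

```python
def journals_to_reach(probabilities, threshold=50.0):
    """Return (count, cumulative mass) for the first prefix reaching the threshold."""
    total = 0.0
    for count, probability in enumerate(probabilities, start=1):
        total += probability
        if total >= threshold:
            return count, total
    return None, total


# Percentages for ranks 1 to 10 as listed on this page (rounded values).
listed = [17.4, 8.1, 6.8, 6.3, 6.3, 4.8, 3.6, 3.6, 3.0, 2.7]
count, mass = journals_to_reach(listed)
print(count, round(mass, 1))
```

With the rounded values, six journals sum to just under 50% and the seventh takes the cumulative mass past it, consistent with the statement above.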

1. Aging Cell | 144 papers in training set | Top 0.4% | 17.4%
2. Nature Aging | 51 papers in training set | Top 0.2% | 8.1%
3. Nature Communications | 4913 papers in training set | Top 27% | 6.8%
4. eLife | 5422 papers in training set | Top 13% | 6.3%
5. PLOS Computational Biology | 1633 papers in training set | Top 6% | 6.3%
6. npj Aging | 15 papers in training set | Top 0.2% | 4.8%
7. Advanced Science | 249 papers in training set | Top 6% | 3.6%
50% of probability mass above
8. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 21% | 3.6%
9. Aging | 69 papers in training set | Top 0.8% | 3.0%
10. Scientific Reports | 3102 papers in training set | Top 44% | 2.7%
11. GeroScience | 97 papers in training set | Top 0.7% | 2.7%
12. Communications Biology | 886 papers in training set | Top 5% | 2.1%
13. PLOS ONE | 4510 papers in training set | Top 50% | 1.9%
14. Nature Medicine | 117 papers in training set | Top 2% | 1.9%
15. Science Advances | 1098 papers in training set | Top 15% | 1.9%
16. Neurobiology of Aging | 95 papers in training set | Top 1% | 1.8%
17. Cell Systems | 167 papers in training set | Top 7% | 1.8%
18. Frontiers in Genetics | 197 papers in training set | Top 5% | 1.6%
19. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 5% | 1.5%
20. Cell Reports | 1338 papers in training set | Top 31% | 0.9%
21. Nature Machine Intelligence | 61 papers in training set | Top 3% | 0.8%
22. Bioinformatics | 1061 papers in training set | Top 10% | 0.7%
23. Briefings in Bioinformatics | 326 papers in training set | Top 7% | 0.7%
24. Human Brain Mapping | 295 papers in training set | Top 5% | 0.7%
25. Nature | 575 papers in training set | Top 17% | 0.6%
26. The American Journal of Human Genetics | 206 papers in training set | Top 4% | 0.6%