
An explainable AI framework for interpretable biological age

Qiu, W.; Chen, H.; Kaeberlein, M.; Lee, S.-I.

2022-10-06 health informatics
10.1101/2022.10.05.22280735 medRxiv

Background: An individual's biological age is a measurement of health status and provides a mechanistic understanding of aging. Age clocks estimate the biological age of an individual based on their various features. Existing clocks have key limitations caused by the undesirable tradeoff between accuracy (i.e., predictive performance for chronological age or mortality, often achieved by complex, black-box models) and interpretability (i.e., the contributions of features to biological age). Here, we present ENABL (ExplaiNAble BioLogical) Age, a computational framework that combines machine learning (ML) models with explainable AI (XAI) methods to accurately estimate biological age with individualized explanations.

Methods: To construct the ENABL Age clock, we first predict an age-related outcome of interest (e.g., all-cause or cause-specific mortality), and then rescale the predictions nonlinearly to estimate biological age. We trained and evaluated the ENABL Age clock using the UK Biobank (501,366 samples with 825 features) and NHANES 1999-2014 (47,084 samples with 158 features) datasets. To explain the ENABL Age clock, we extended existing XAI methods so we could linearly decompose any individual's ENABL Age into contributing risk factors. To make the ENABL Age clock broadly accessible, we developed two versions: (1) ENABL Age-L, which is based on popular blood tests, and (2) ENABL Age-Q, which is based on questionnaire features. Finally, when we created ENABL Age clocks based on predictions of different age-related outcomes, we validated that each one captures sensible, yet disparate aging mechanisms by performing GWAS analyses.

Findings: Our results indicate that ENABL Age clocks successfully separate healthy from unhealthy aging individuals and are stronger predictors of mortality than existing age clocks. We externally validated our results by training ENABL Age clocks on UK Biobank data and testing on NHANES data. The individualized explanations that reveal the contribution of specific features to ENABL Age provide insights into the important features for biological age. Association analysis with risk factors and aging-related morbidities, and genome-wide association study (GWAS) results on ENABL Age clocks trained on different mortality causes, show that each one captures sensible aging mechanisms.

Interpretation: We developed and validated a new ML and XAI-based approach to calculate and interpret biological age based on multiple aging mechanisms. Our results show strong mortality prediction power, interpretability, and flexibility. ENABL Age takes a consequential step towards accurate, interpretable biological age prediction built with complex, high-performance ML models.

Research in context

Evidence before this study: Biological age plays an important role in understanding the mechanisms underlying aging. We searched PubMed for original articles in any language containing the term "biological age" and published up to June 22, 2022. Most prior studies focus on the first generation of biological age clocks, which are designed to predict chronological age. These clocks have weak and variable associations with mortality risk and other aging outcomes. Only a few studies present the second generation of biological age clocks, which are built directly with aging outcomes. However, these studies use linear models and do not provide individualized explanations. Moreover, previous biological age clocks cannot specify what aging process they capture. Unlike our study, none of the previous studies have combined a complex machine learning (ML) model and an explainable artificial intelligence (XAI) method, which allows us to build biological ages that are both accurate and interpretable.

Added value of this study: In this study, we present ENABL Age, a new approach to estimate and understand biological age that combines complex ML models and XAI methods. The ENABL Age approach is designed to build second-generation biological age clocks by directly predicting age-related outcomes. Our results indicate that ENABL Age accurately reflects individual health status. We also introduce two variants of ENABL Age clocks: (1) ENABL Age-L, which takes popular blood tests as inputs (usable by medical professionals), and (2) ENABL Age-Q, which takes questionnaire features as inputs (usable by non-professional healthcare consumers). We extend existing XAI methods to calculate the contributions of input features to the ENABL Age estimate in units of years, which makes our biological age clocks more human-interpretable. Our association analysis and GWAS results show that ENABL Age clocks trained on different age-related outcomes can capture different aging mechanisms.

Implications of all the available evidence: We develop and validate a new ML and XAI-based approach to measure and interpret biological age based on multiple aging mechanisms. Our results demonstrate that ENABL Age has strong mortality prediction power, is interpretable, and is flexible. ENABL Age takes a consequential step towards applying XAI to interpret biological age models. Its flexibility allows for many future extensions to omics data, even multi-omic data, and multi-task learning.
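
The two steps named in the abstract (rescale a predicted risk nonlinearly into years, then decompose the result linearly into feature contributions) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a hypothetical Gompertz-style reference curve, log(risk) = A + B * age, with made-up constants `A` and `B`, and it assumes the model's attributions are additive on the log-risk scale. The feature names and attribution values are invented for illustration.

```python
import math

# Hypothetical reference curve: log(risk) = A + B * age.
# A and B are illustration values, not fitted to UK Biobank or NHANES.
A = -10.0
B = 0.09


def risk_to_age(risk: float) -> float:
    """Nonlinear rescaling: the age at which the reference curve reaches `risk`."""
    return (math.log(risk) - A) / B


def decompose_in_years(base_log_risk: float, contributions: dict) -> tuple:
    """Turn additive log-risk attributions into additive contributions in years."""
    base_age = (base_log_risk - A) / B
    years = {name: phi / B for name, phi in contributions.items()}
    return base_age, years


# Toy individual: made-up additive attributions on the log-risk scale.
base_log_risk = -5.5
contributions = {"albumin": -0.18, "smoking": 0.45, "glucose": 0.09}

log_risk = base_log_risk + sum(contributions.values())
bio_age = risk_to_age(math.exp(log_risk))
base_age, years = decompose_in_years(base_log_risk, contributions)

print(f"biological age: {bio_age:.1f} years (baseline {base_age:.1f})")
for name, value in years.items():
    print(f"  {name}: {value:+.1f} years")
```

Because the assumed rescaling is linear in log-risk, the per-feature contributions in years sum, together with the baseline, to the rescaled biological age. A different rescaling would need a different way of carrying attributions over to the age scale.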

Matching journals

The top 4 journals account for 50% of the predicted probability mass.
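
The "top 4" figure can be checked against the probabilities listed below. The sketch assumes the cutoff is the smallest rank at which the cumulative predicted probability reaches 50%; the page does not state how it is computed, and the function name is illustrative.

```python
# Predicted probabilities (in percent) for the first six journals, in rank order.
probs = [38.1, 6.4, 4.4, 4.0, 4.0, 3.6]


def journals_to_reach(probs, threshold=50.0):
    """Return the first rank whose cumulative probability meets the threshold."""
    total = 0.0
    for rank, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None, total


rank, mass = journals_to_reach(probs)
print(rank, round(mass, 1))  # 4 52.9
```

The first three journals sum to 48.9%, just short of the threshold, so the fourth is needed to pass 50%.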

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | GeroScience | 97 | Top 0.1% | 38.1% |
| 2 | Frontiers in Aging Neuroscience | 67 | Top 0.5% | 6.4% |
| 3 | Nature Communications | 4913 | Top 35% | 4.4% |
| 4 | Bioinformatics | 1061 | Top 5% | 4.0% |
| | 50% of probability mass above | | | |
| 5 | npj Aging | 15 | Top 0.2% | 4.0% |
| 6 | Scientific Reports | 3102 | Top 35% | 3.6% |
| 7 | Aging Cell | 144 | Top 1% | 3.6% |
| 8 | European Journal of Epidemiology | 40 | Top 0.2% | 2.5% |
| 9 | Neurobiology of Aging | 95 | Top 1% | 1.9% |
| 10 | eLife | 5422 | Top 41% | 1.7% |
| 11 | The Journals of Gerontology: Series A | 25 | Top 0.5% | 1.7% |
| 12 | Aging | 69 | Top 1% | 1.7% |
| 13 | PLOS ONE | 4510 | Top 54% | 1.7% |
| 14 | Nature Aging | 51 | Top 1% | 1.0% |
| 15 | PLOS Computational Biology | 1633 | Top 21% | 1.0% |
| 16 | JAMIA Open | 37 | Top 1% | 1.0% |
| 17 | The Journals of Gerontology, Series A: Biological Sciences and Medical Sciences | 22 | Top 0.3% | 0.9% |
| 18 | The Journal of Prevention of Alzheimer's Disease | 10 | Top 0.3% | 0.9% |
| 19 | Biology Methods and Protocols | 53 | Top 2% | 0.9% |
| 20 | Patterns | 70 | Top 2% | 0.9% |
| 21 | BMC Medical Research Methodology | 43 | Top 1% | 0.9% |
| 22 | Journal of Medical Internet Research | 85 | Top 4% | 0.8% |
| 23 | Age and Ageing | 27 | Top 0.4% | 0.8% |
| 24 | Frontiers in Artificial Intelligence | 18 | Top 0.7% | 0.8% |
| 25 | Bioinformatics Advances | 184 | Top 5% | 0.8% |
| 26 | GENETICS | 189 | Top 1% | 0.8% |
| 27 | Genetic Epidemiology | 46 | Top 0.9% | 0.7% |
| 28 | Clinical Epigenetics | 53 | Top 1% | 0.6% |
| 29 | BMJ Open | 554 | Top 14% | 0.5% |
| 30 | BMC Bioinformatics | 383 | Top 8% | 0.5% |