Deep phenotyping obesity using EHR data: Promise, Challenges, and Future Directions

Ruan, X.; Lu, S.; Wang, L.; Wen, A.; Murali, S. B.; Liu, H.

2024-12-08 health informatics
10.1101/2024.12.06.24318608 medRxiv

Obesity affects approximately 34% of adults and 15-20% of children and adolescents in the U.S., and poses significant economic and psychosocial burdens. Because obesity is multifaceted, patient responses to any single anti-obesity medication (AOM) vary significantly, highlighting the need for approaches to obesity deep phenotyping and associated precision medicine. While recent advances in classical phenotyping-guided pharmacotherapies have shown clinical value, healthcare providers have been slow to adopt them within the precision medicine framework, primarily because of their operational complexity and lack of granularity. Several recent review articles have accordingly highlighted the importance of obesity deep phenotyping for personalized precision medicine. Given the established role of the electronic health record (EHR) as an important data source for clinical phenotyping, we offer an in-depth analysis of the data elements commonly available from patients with obesity prior to pharmacotherapy. We also experimented with a multi-modal longitudinal deep autoencoder to explore the feasibility, data requirements, clustering patterns, and challenges associated with EHR-based obesity deep phenotyping. Our analysis indicates at least nine clusters, among which five have distinct, explainable clinical relevance. Further research in larger independent cohorts is warranted to validate reproducibility and to uncover more detailed substructures and the corresponding treatment responses.

Background: Obesity affects approximately 40% of adults and 15-20% of children and adolescents in the U.S., and poses significant economic and psychosocial burdens. Currently, patient responses to any single anti-obesity medication (AOM) vary significantly, making obesity deep phenotyping and associated precision medicine important targets of investigation.

Objective: To evaluate the potential of the EHR as a primary data source for obesity deep phenotyping, we conduct an in-depth analysis of the data elements and data quality available from patients with obesity prior to pharmacotherapy, and apply a multi-modal longitudinal deep autoencoder to investigate the feasibility, data requirements, clustering patterns, and challenges associated with EHR-based obesity deep phenotyping.

Methods: We analyzed 53,688 pre-AOM periods from 32,969 patients with obesity or overweight who underwent medium- to long-term AOM treatment. A total of 92 lab and vital measurements, along with 79 ICD-derived Clinical Classifications Software (CCS) codes recorded within one year prior to AOM treatment, were used to train a longitudinal autoencoder based on the gated recurrent unit with decay (GRU-D-AE) to generate dense embeddings for each pre-AOM record. Principal component analysis (PCA) and Gaussian mixture modeling (GMM) were then applied to identify clusters.

Results: Our analysis identified at least nine clusters, with five exhibiting distinct and explainable clinical relevance. Certain clusters show characteristics that overlap with phenotypes from traditional phenotyping strategies. Results from multiple training folds demonstrated stable clustering patterns in two-dimensional space and reproducible clinical significance. However, challenges persist regarding the stability of missing data imputation across folds, maintaining consistency in input features, and effectively visualizing complex diseases in low-dimensional spaces.

Conclusion: In this proof-of-concept study, we demonstrated that longitudinal EHR data are a valuable resource for deep phenotyping the pre-AOM period at the per-patient-visit level. Our analysis revealed clusters with distinct clinical significance, which could have implications for AOM treatment options. Further research using larger, independent cohorts is necessary to validate the reproducibility and clinical relevance of these clusters and to uncover more detailed substructures and the corresponding AOM treatment responses.
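The abstract names a gated recurrent unit with decay (GRU-D) as the basis of the autoencoder but does not spell out how missing measurements are handled. As a rough illustration only, the sketch below follows the general GRU-D input-decay idea: a missing value is replaced by a blend of the last observed value and the feature mean, with the weight on the last observation shrinking as the time gap grows. The function names, the fixed decay parameters `w` and `b` (learned per feature in a real GRU-D), and the sample values are all hypothetical and are not taken from the study.

```python
import math


def decay_rate(delta, w, b):
    """GRU-D style decay factor: gamma = exp(-max(0, w * delta + b)).

    delta is the time since the feature was last observed; the result
    lies in (0, 1], with 1 meaning "trust the last observation fully".
    """
    return math.exp(-max(0.0, w * delta + b))


def impute_series(values, times, mean, w=0.1, b=0.0):
    """Fill missing entries (None) in one irregularly sampled feature.

    Observed values pass through unchanged. A missing value becomes
    gamma * last_observed + (1 - gamma) * mean, so it drifts toward the
    feature mean as the gap since the last observation widens.
    w and b are hypothetical fixed parameters for this sketch.
    """
    imputed = []
    last_value = None
    last_time = None
    for x, t in zip(values, times):
        if x is not None:
            # Observed: keep the value and remember when it was seen.
            imputed.append(x)
            last_value, last_time = x, t
        elif last_time is None:
            # Nothing observed yet: fall back to the feature mean.
            imputed.append(mean)
        else:
            gamma = decay_rate(t - last_time, w, b)
            imputed.append(gamma * last_value + (1.0 - gamma) * mean)
    return imputed


# Hypothetical lab values sampled at irregular day offsets.
values = [100.0, None, None, 140.0, None]
times = [0, 30, 90, 120, 300]
filled = impute_series(values, times, mean=110.0)
print(filled)
```

In a full model the same decay is usually applied to the hidden state as well, and the decay parameters are trained jointly with the network; this sketch covers only the input side.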

Matching journals

The top 4 journals together account for just over 50% of the predicted probability mass (54.8% cumulative).

1. JAMIA Open | 37 papers in training set | Top 0.1% | 22.8%
2. Journal of Biomedical Informatics | 45 papers in training set | Top 0.1% | 14.9%
3. npj Digital Medicine | 97 papers in training set | Top 0.4% | 10.6%
4. Journal of Medical Internet Research | 85 papers in training set | Top 0.7% | 6.5%
(50% of probability mass above)
5. BMC Medical Informatics and Decision Making | 39 papers in training set | Top 0.6% | 4.4%
6. PLOS ONE | 4510 papers in training set | Top 37% | 3.7%
7. Scientific Reports | 3102 papers in training set | Top 35% | 3.6%
8. PLOS Digital Health | 91 papers in training set | Top 0.9% | 2.8%
9. JMIR Medical Informatics | 17 papers in training set | Top 0.5% | 2.1%
10. IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 0.8% | 1.9%
11. Journal of the American Medical Informatics Association | 61 papers in training set | Top 1% | 1.8%
12. eBioMedicine | 130 papers in training set | Top 1% | 1.7%
13. Communications Medicine | 85 papers in training set | Top 0.2% | 1.7%
14. International Journal of Medical Informatics | 25 papers in training set | Top 1% | 1.2%
15. Nature Communications | 4913 papers in training set | Top 59% | 0.9%
16. Journal of Personalized Medicine | 28 papers in training set | Top 0.9% | 0.9%
17. JMIR Public Health and Surveillance | 45 papers in training set | Top 3% | 0.9%
18. International Journal of Environmental Research and Public Health | 124 papers in training set | Top 7% | 0.8%
19. BMC Medical Research Methodology | 43 papers in training set | Top 1% | 0.8%
20. eClinicalMedicine | 55 papers in training set | Top 2% | 0.7%
21. Expert Systems with Applications | 11 papers in training set | Top 0.6% | 0.7%
22. Frontiers in Cardiovascular Medicine | 49 papers in training set | Top 3% | 0.7%
23. Diabetes, Obesity and Metabolism | 17 papers in training set | Top 0.6% | 0.7%
24. Frontiers in Public Health | 140 papers in training set | Top 9% | 0.7%
25. Current Developments in Nutrition | 15 papers in training set | Top 1.0% | 0.7%
26. The Journal of Clinical Endocrinology & Metabolism | 35 papers in training set | Top 1% | 0.7%
27. The American Journal of Clinical Nutrition | 19 papers in training set | Top 0.4% | 0.7%
28. Healthcare | 16 papers in training set | Top 2% | 0.7%
29. Nutrients | 64 papers in training set | Top 2% | 0.7%
30. The Journal of Pediatrics | 15 papers in training set | Top 0.7% | 0.7%
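
The 50% cutoff in the ranking above can be checked by accumulating the listed probabilities in rank order until the running total reaches the threshold. The snippet below is a minimal sketch of that arithmetic using the first five listed values; the function name `journals_to_reach` and the "greater than or equal to" stopping rule are assumptions, since the page does not say how it computes the cutoff.

```python
# Predicted probabilities (percent) for the five highest-ranked journals,
# copied from the listing above.
ranked = [
    ("JAMIA Open", 22.8),
    ("Journal of Biomedical Informatics", 14.9),
    ("npj Digital Medicine", 10.6),
    ("Journal of Medical Internet Research", 6.5),
    ("BMC Medical Informatics and Decision Making", 4.4),
]


def journals_to_reach(ranked, threshold=50.0):
    """Return (count, cumulative) for the first rank at which the running
    sum of probabilities reaches the threshold, or (None, total) if the
    listed entries never reach it."""
    total = 0.0
    for count, (_, prob) in enumerate(ranked, start=1):
        total += prob
        if total >= threshold:
            return count, total
    return None, total


count, cumulative = journals_to_reach(ranked)
print(count, round(cumulative, 1))
```

With these values the top three journals sum to 48.3%, so the fourth is the one that pushes the running total past 50%.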