Deep Longitudinal Clusters of Type 2 Diabetes Pathophysiology and their Risk of Cardiovascular Disease Events and All-Cause Mortality

Varghese, J. S.; Guo, J.; Hua, D.; Hung, T.; Li, Z.; Tang, S.; Patel, S. A.; Ho, J. C.

2026-06-03 endocrinology
10.64898/2026.06.01.26354645 medRxiv

Objective: Despite the complex and non-linear progression of diabetes, its shared pathways with atherosclerotic cardiovascular disease (ASCVD) are conventionally described using models based on single time points. We identified longitudinal diabetes clusters before diagnosis using deep learning and studied their association with ASCVD events and mortality.

Methods: We analyzed 157,670 visits from 15,871 adults (25-65 years) without diabetes from four pooled U.S. cohorts (median follow-up: 22 years [IQR: 9-30]). A gated recurrent unit model with decay (GRU-D) was used to predict 1-year risk of diabetes or censoring within 10 years, by learning longitudinal embeddings across 25 clinical characteristics and biomarkers. Parallel Factor Analysis-2 (PARAFAC-2) and Gaussian mixture models (GMM) were used to group longitudinal participant representations into clusters. Landmark-time Cox proportional hazards regressions, anchored at the last observation in the training window, were used to study covariate-adjusted associations of clusters with ASCVD and mortality. The prognostic utility of clusters beyond the PREVENT risk score was assessed using Harrell's C-index. Findings were replicated in a fifth cohort.

Results: The analytic sample had a mean age of 49 years (SD: 11), was 58% female, and was 68% white; 1,202 participants (8%) developed diabetes within the first 10 years. We identified five clusters (Clusters A to E) that differed in their clinical characteristics over time. Cluster E (46%) had the highest cumulative incidence of diabetes in the study period, followed by Cluster C (40%) and Cluster A (38%). Cluster C, which was defined by older age, high blood pressure, and suboptimal renal function at the first visit, had higher rates of ASCVD (HR: 1.09, 95% CI: 0.98-1.21) and mortality (HR: 1.08, 95% CI: 1.00-1.16) relative to Cluster A, despite being similar in age and BMI at the first visit. Relative to Cluster A, all other clusters had similar or lower rates of ASCVD and mortality. We observed substantial cluster effects for three clusters (Clusters C to E), which were based on only two cohorts. The two clusters (Clusters A and B) that included participants from all four cohorts were reproduced in the fifth cohort and showed similar rates of outcomes. Clusters did not improve ASCVD prognosis relative to a model that included only the PREVENT risk score.

Conclusions: Longitudinal clusters reveal substantial heterogeneity in the period before diabetes diagnosis and in the associated risk of ASCVD and mortality. However, the clusters discovered may, in part, be explained by cohort effects arising from variation in recruitment and in visit patterns after recruitment.
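The abstract compares prognostic models using Harrell's C-index. As a rough illustration of what that statistic measures, the sketch below computes it in plain Python for two sets of risk scores. It is not the authors' code: the function name, the toy follow-up times, event indicators, and both score lists are hypothetical, and pairs with tied follow-up times are simply skipped.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the subject who had the event
    earlier also has the higher risk score (ties in score count as 0.5)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only comparable if the earlier time is an observed event
        for j in range(n):
            if times[i] < times[j]:  # subject i had the event before subject j's last follow-up
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in years and event indicator (1 = event, 0 = censored).
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]

# Hypothetical predictions from a baseline model and from a model with extra covariates.
base_scores = [0.9, 0.4, 0.5, 0.3, 0.1]
extended_scores = [0.7, 0.4, 0.6, 0.2, 0.1]

c_base = harrell_c_index(times, events, base_scores)
c_extended = harrell_c_index(times, events, extended_scores)
print(f"baseline C = {c_base:.3f}, extended C = {c_extended:.3f}, "
      f"difference = {c_extended - c_base:.3f}")
```

A difference near zero between the two values would indicate that the extra covariates do not change how well the model ranks participants by risk. In practice a survival analysis library would normally be used instead of this quadratic loop.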

Matching journals

The top 6 journals together account for just over 50% of the predicted probability mass (53.8%).

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | Diabetologia | 36 | Top 0.1% | 23.2%
2 | Diabetes Care | 12 | Top 0.1% | 9.4%
3 | The Journal of Clinical Endocrinology & Metabolism | 35 | Top 0.2% | 7.4%
4 | Diabetes, Obesity and Metabolism | 17 | Top 0.1% | 5.0%
5 | BMJ Open Diabetes Research & Care | 15 | Top 0.2% | 4.4%
6 | Nature Communications | 4913 | Top 35% | 4.4%
(50% of probability mass above)
7 | Diabetes | 53 | Top 0.2% | 3.7%
8 | Nature Medicine | 117 | Top 1% | 3.0%
9 | Communications Medicine | 85 | Top 0.1% | 2.7%
10 | Scientific Reports | 3102 | Top 48% | 2.1%
11 | Cardiovascular Research | 33 | Top 0.4% | 1.9%
12 | BMC Medicine | 163 | Top 3% | 1.8%
13 | eBioMedicine | 130 | Top 1% | 1.7%
14 | eLife | 5422 | Top 45% | 1.5%
15 | Frontiers in Endocrinology | 53 | Top 1% | 1.4%
16 | npj Digital Medicine | 97 | Top 3% | 1.3%
17 | Cell Reports Medicine | 140 | Top 6% | 1.1%
18 | PLOS ONE | 4510 | Top 61% | 1.0%
19 | JMIR Public Health and Surveillance | 45 | Top 3% | 1.0%
20 | PLOS Medicine | 98 | Top 4% | 0.9%
21 | BMJ | 49 | Top 0.9% | 0.9%
22 | Molecular Metabolism | 105 | Top 2% | 0.8%
23 | Alzheimer's & Dementia | 143 | Top 3% | 0.8%
24 | Metabolism | 14 | Top 0.4% | 0.8%
25 | JAMIA Open | 37 | Top 1% | 0.8%
26 | The Journal of Pediatrics | 15 | Top 0.6% | 0.7%
27 | Journal of Racial and Ethnic Health Disparities | 11 | Top 0.5% | 0.7%
28 | British Journal of General Practice | 22 | Top 0.6% | 0.7%
29 | Frontiers in Cardiovascular Medicine | 49 | Top 3% | 0.7%
30 | Human Molecular Genetics | 130 | Top 4% | 0.7%
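
The 50% cut-off in the ranking above can be checked with a running total of the listed probabilities. The snippet below is a small sketch of that arithmetic: the values are copied from the ten highest-ranked journals in the list, and the helper name `journals_needed` is made up for illustration.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1 to 10, copied from the list above.
top_probs = [23.2, 9.4, 7.4, 5.0, 4.4, 4.4, 3.7, 3.0, 2.7, 2.1]


def journals_needed(probs, threshold):
    """Return how many top-ranked entries are needed for the running total
    to reach the threshold, or None if the threshold is never reached."""
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count
    return None


print(journals_needed(top_probs, 50.0))
```

The running total is 49.4% after five journals and 53.8% after six, so six journals are needed to pass the 50% mark.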