
Time-to-event modeling with multimodal clinical and genetic features improves risk stratification of liver complications in chronic hepatitis C

Islam, H.; Arian, A.; Franses, J. W.; Ahsan, H.

2026-03-09 health informatics
10.64898/2026.03.06.26347819 medRxiv

Chronic hepatitis C (CHC) remains a leading cause of cirrhosis, hepatocellular carcinoma (HCC), and premature mortality despite effective antiviral therapy, underscoring the need for individualized risk stratification beyond fibrosis stage alone. Using harmonized data from the All of Us Research Program, we developed and internally validated an interpretable multimodal survival framework to predict incident cirrhosis, HCC, and all-cause mortality, explicitly accounting for death as a competing risk. Baseline predictors within a ±180-day window around CHC diagnosis included demographics, comorbidities, medications, laboratory biomarkers, socioeconomic context, and selected germline variants. Penalized Cox, ensemble, gradient-boosted, and neural survival models were compared under a consistent training and held-out testing strategy. The best-performing models achieved test C-indices of 0.67 for cirrhosis (Coxnet-LASSO), 0.71 for HCC, and 0.75 for mortality (Random Survival Forest), with stable time-dependent AUROC up to 0.81. Substantial feature compression preserved discrimination: restricting to the top 50% or 25% of predictors resulted in minimal absolute change in test performance (3.5%). Reduced models were anchored in clinically interpretable domains, including age, liver injury markers, hepatic reserve, cardiometabolic burden, deprivation index, and chromosome 19/22 loci. Feature importance reinforced established clinical and biological risk factors for liver complications: liver injury markers were most influential for cirrhosis and HCC, whereas hepatic reserve and cardiometabolic burden were more predictive of mortality, with age serving as a central baseline determinant across outcomes. Together, these results support a scalable and parsimonious framework for individualized CHC risk stratification that integrates multimodal determinants.
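The C-indices quoted in the abstract measure how often a model ranks the subject with the earlier observed event as higher risk. The following is a minimal sketch of Harrell's concordance index on right-censored data, not the authors' implementation: it ignores competing risks, skips tied event times for simplicity, and the function name `harrell_c_index` and the toy data are hypothetical.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C for right-censored data (illustrative sketch only).

    times: observed follow-up times
    events: 1 if the event was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier subject was censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from the study)
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.8, 0.1]
print(harrell_c_index(toy_times, toy_events, toy_risk))  # 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported 0.67 to 0.75 in context.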

Matching journals

The top 8 journals account for 50% of the predicted probability mass.
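The "top 8" figure can be checked by accumulating the listed probabilities in rank order until the running total reaches 50%. This is a small sketch using the first ten rounded percentages from the ranking; the helper name `journals_to_reach` is hypothetical, and rounding in the displayed values means the total is approximate.

```python
# Predicted probabilities (%) for ranks 1-10, as displayed in the ranking
probabilities = [17.9, 9.0, 6.3, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 2.6]


def journals_to_reach(probs, threshold):
    """Return (rank, cumulative total) at which the running sum first reaches threshold."""
    total = 0.0
    for rank, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None  # threshold never reached with the values given


rank, total = journals_to_reach(probabilities, 50.0)
print(rank, round(total, 1))  # 8 50.7
```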

| Rank | Journal | Papers in training set | Percentile | Probability |
|---|---|---|---|---|
| 1 | Nature Communications | 4913 | Top 7% | 17.9% |
| 2 | Journal of Hepatology | 18 | Top 0.1% | 9.0% |
| 3 | Cell Reports Medicine | 140 | Top 0.5% | 6.3% |
| 4 | Advanced Science | 249 | Top 6% | 3.5% |
| 5 | Communications Biology | 886 | Top 2% | 3.5% |
| 6 | Med | 38 | Top 0.1% | 3.5% |
| 7 | Scientific Reports | 3102 | Top 39% | 3.5% |
| 8 | Gut Microbes | 70 | Top 0.3% | 3.5% |

50% of probability mass above

| Rank | Journal | Papers in training set | Percentile | Probability |
|---|---|---|---|---|
| 9 | eBioMedicine | 130 | Top 0.4% | 3.5% |
| 10 | Science Translational Medicine | 111 | Top 1% | 2.6% |
| 11 | BMC Medicine | 163 | Top 3% | 2.0% |
| 12 | Communications Medicine | 85 | Top 0.1% | 2.0% |
| 13 | Annals of Internal Medicine | 27 | Top 0.3% | 1.9% |
| 14 | Nature Medicine | 117 | Top 2% | 1.9% |
| 15 | Nature Biomedical Engineering | 42 | Top 0.7% | 1.8% |
| 16 | Patterns | 70 | Top 0.8% | 1.8% |
| 17 | European Respiratory Journal | 54 | Top 1.0% | 1.7% |
| 18 | npj Digital Medicine | 97 | Top 2% | 1.5% |
| 19 | PLOS Computational Biology | 1633 | Top 18% | 1.5% |
| 20 | Science Advances | 1098 | Top 22% | 1.3% |
| 21 | Frontiers in Immunology | 586 | Top 5% | 1.3% |
| 22 | Journal of Infection | 71 | Top 2% | 1.3% |
| 23 | Gut | 36 | Top 0.6% | 1.2% |
| 24 | Cell Genomics | 162 | Top 5% | 1.1% |
| 25 | Hepatology | 18 | Top 0.3% | 0.9% |
| 26 | Gastroenterology | 40 | Top 2% | 0.8% |
| 27 | NeuroImage: Clinical | 132 | Top 4% | 0.8% |
| 28 | Nature Machine Intelligence | 61 | Top 3% | 0.8% |
| 29 | EMBO Molecular Medicine | 85 | Top 4% | 0.8% |
| 30 | Annals of Neurology | 57 | Top 2% | 0.7% |