An Interpretable Deep Learning Framework for Biomarker Discovery in Complex Disease Survival Outcomes

Wan, S.; Mi, X.; Zou, F.; Zou, B.

2025-10-01 bioinformatics
10.1101/2025.09.30.679415 bioRxiv

Identifying important biomarkers associated with survival outcomes in complex diseases is fundamental to understanding disease mechanisms in depth and to advancing precision medicine in conditions such as cancer and cardiovascular disorders. However, this task is complicated by the nature of time-to-event data, which capture both the occurrence and the timing of clinical events. In particular, non-linear and non-additive biomarker interactions, together with high dimensionality, challenge conventional survival modeling approaches. To address these difficulties, we propose SurvDNN, an enhanced deep neural network framework designed specifically for modeling survival outcomes. SurvDNN incorporates a bootstrapping-based regularization strategy to mitigate overfitting and a novel stability-driven filtering algorithm to improve model robustness. To enable interpretable biomarker discovery, we extend the Permutation-based Feature Importance Test (PermFIT) to survival settings, allowing rigorous quantification of individual biomarker contributions under complex biomarker-outcome associations. In extensive simulations and applications to real-world datasets, SurvDNN consistently outperforms existing machine learning approaches in both biomarker identification and predictive accuracy. Our results demonstrate the potential of SurvDNN coupled with PermFIT as an interpretable, robust, and powerful tool for biomarker-driven survival modeling in complex diseases. An open-source R package implementing SurvDNN is publicly available on GitHub (https://github.com/BZou-lab/SurvDNN).
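
To make the idea of permutation-based importance for survival outcomes concrete, here is a minimal Python sketch. It is not the PermFIT procedure and not the SurvDNN R package. It assumes a hypothetical fitted risk function `predict_risk`, uses Harrell's concordance index as the performance metric (an assumption; the abstract does not name one), and reports the mean drop in concordance when a single feature column is shuffled. The toy data and all function names are illustrative only.

```python
import random


def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs that the risk scores order correctly."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has an observed event
            # strictly before subject j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in risk count as half
    return concordant / comparable if comparable else float("nan")


def permutation_importance(predict_risk, X, times, events, n_repeats=20, seed=0):
    """Mean drop in C-index per feature after shuffling that feature's column."""
    rng = random.Random(seed)
    baseline = concordance_index(times, events, [predict_risk(row) for row in X])
    drops = []
    for col in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this one column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            score = concordance_index(
                times, events, [predict_risk(row) for row in X_perm]
            )
            total += baseline - score
        drops.append(total / n_repeats)
    return baseline, drops


# Toy data (hypothetical): feature 0 drives risk, feature 1 is pure noise.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(30)]
times = [10.0 * (1.0 - row[0]) + 0.1 for row in X]  # higher x0 means earlier event
events = [0 if k % 5 == 0 else 1 for k in range(30)]  # every fifth subject censored


def predict_risk(row):
    # Stand-in for a fitted survival model: risk depends on feature 0 only.
    return row[0]


baseline, drops = permutation_importance(predict_risk, X, times, events)
print("baseline C-index:", baseline)
print("mean C-index drop per feature:", drops)
```

In this toy setup the risk function ignores the second feature, so shuffling it leaves the concordance unchanged, while shuffling the first feature degrades it. A formal test of importance would need more than this raw drop, for example an estimate of its variability.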

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Patterns | 70 papers in training set | Top 0.1% | 17.2%
2. Advanced Science | 249 papers in training set | Top 1% | 9.9%
3. Nature Communications | 4913 papers in training set | Top 23% | 8.3%
4. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 12% | 6.3%
5. Bioinformatics | 1061 papers in training set | Top 4% | 6.3%
6. Nature Machine Intelligence | 61 papers in training set | Top 0.5% | 6.2%

50% of probability mass above

7. Briefings in Bioinformatics | 326 papers in training set | Top 1.0% | 6.2%
8. Scientific Reports | 3102 papers in training set | Top 32% | 3.9%
9. PLOS Computational Biology | 1633 papers in training set | Top 11% | 3.0%
10. Genome Medicine | 154 papers in training set | Top 3% | 2.4%
11. Communications Biology | 886 papers in training set | Top 7% | 1.9%
12. Frontiers in Genetics | 197 papers in training set | Top 4% | 1.8%
13. Nature Biomedical Engineering | 42 papers in training set | Top 0.9% | 1.7%
14. PLOS ONE | 4510 papers in training set | Top 59% | 1.3%
15. Medical Image Analysis | 33 papers in training set | Top 0.7% | 1.3%
16. npj Systems Biology and Applications | 99 papers in training set | Top 2% | 1.1%
17. Science Advances | 1098 papers in training set | Top 25% | 0.9%
18. IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 2% | 0.9%
19. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 10% | 0.7%
20. Cell Systems | 167 papers in training set | Top 13% | 0.7%
21. BMC Bioinformatics | 383 papers in training set | Top 7% | 0.7%
22. Genome Research | 409 papers in training set | Top 5% | 0.7%
23. The American Journal of Human Genetics | 206 papers in training set | Top 4% | 0.7%
24. NAR Genomics and Bioinformatics | 214 papers in training set | Top 4% | 0.6%
25. npj Digital Medicine | 97 papers in training set | Top 4% | 0.6%
26. Biometrics | 22 papers in training set | Top 0.3% | 0.6%
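
The 50% cutoff marked in the list above can be checked by accumulating the listed probabilities in rank order. The short Python sketch below does this for the first ten entries; the function name `journals_to_reach` is hypothetical, and the percentages are copied from the list as displayed (rounded to one decimal place), so the running total is approximate.

```python
def journals_to_reach(probabilities, threshold=50.0):
    """Return (rank, cumulative %) at which the running total first reaches threshold."""
    cumulative = 0.0
    for rank, p in enumerate(probabilities, start=1):
        cumulative += p
        if cumulative >= threshold:
            return rank, cumulative
    return None, cumulative  # threshold never reached


# Predicted probabilities (%) for ranks 1 to 10, as listed above.
top_probs = [17.2, 9.9, 8.3, 6.3, 6.3, 6.2, 6.2, 3.9, 3.0, 2.4]

rank, mass = journals_to_reach(top_probs)
print(rank, round(mass, 1))
```

The first five journals sum to about 48.0%, so the sixth is the first to push the running total past 50%, which matches the marker placed after rank 6.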