Pan-cancer survival modeling reveals structural limits of genomic feature integration in immunotherapy outcomes

Hassan, W.; Adeleke, S.

2026-04-18 bioinformatics
10.64898/2026.04.15.718634 bioRxiv
Background: Immune checkpoint inhibitors (ICIs) have improved outcomes across multiple cancer types, yet reliable predictors of survival remain limited. While genomic features such as tumor mutational burden (TMB) are widely used, their contribution to predictive modeling in heterogeneous real-world cohorts remains unclear. We evaluated the relative contributions of clinical and whole-genome sequencing (WGS) features in pan-cancer survival modeling.

Methods: We analyzed 658 patients treated with ICIs with matched WGS data from Genomics England. Using a leakage-controlled machine learning framework with strict train-test separation, we compared four models: TMB-only, clinical-only, clinical+TMB, and an integrated 11-feature clinico-genomic XGBoost survival model. Model performance was assessed using Harrell's concordance index (C-index) with bootstrap confidence intervals.

Results: TMB alone demonstrated near-random discrimination (C-index 0.50; 95% CI 0.44-0.56). Clinical variables substantially improved predictive performance (0.59; 95% CI 0.53-0.64), with marginal gain from adding TMB (0.59). The integrated model achieved a C-index of 0.60 (95% CI 0.55-0.65). While the improvement over TMB alone was significant, the incremental gain beyond optimized clinical models was modest. Feature attribution analysis showed that model performance was dominated by clinical variables, with genomic features contributing limited additional signal.

Conclusions: These findings suggest that, in heterogeneous pan-cancer cohorts, predictive performance is constrained by the underlying data structure, in which dominant clinical signals overshadow genome-scale features. This study highlights fundamental limitations in integrating genomic data into survival models across diverse cancer types and provides a benchmark for future computational approaches.
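
The evaluation metric named in the abstract, Harrell's C-index with bootstrap confidence intervals, can be illustrated with a small sketch. This is not the authors' code: the abstract does not say which bootstrap variant was used, so the percentile bootstrap here is an assumption, and the function names (`harrell_c_index`, `bootstrap_ci`) and the toy data are hypothetical. The XGBoost model itself is not reproduced; any risk score can be passed in.

```python
import random

def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher-risk subject fails first.

    A pair (i, j) is comparable when subject i has an observed event and
    a strictly shorter follow-up time than subject j. Tied risk scores
    count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

def bootstrap_ci(times, events, risk_scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the C-index (assumed variant)."""
    harrell_c_index(times, events, risk_scores)  # fail early if undefined
    rng = random.Random(seed)
    n = len(times)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        try:
            stats.append(harrell_c_index(
                [times[k] for k in idx],
                [events[k] for k in idx],
                [risk_scores[k] for k in idx],
            ))
        except ValueError:
            continue  # resample had no comparable pairs; draw again
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

# Toy data (hypothetical): follow-up times, event flags (1 = death), risk scores.
toy_times = [5, 8, 12, 20, 25]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.7, 0.5, 0.4, 0.1]

if __name__ == "__main__":
    print(harrell_c_index(toy_times, toy_events, toy_risk))
    print(bootstrap_ci(toy_times, toy_events, toy_risk))
```

A C-index of 0.5 corresponds to random ordering, which is why the reported 0.50 for TMB alone is described as near-random discrimination.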

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. npj Precision Oncology | 48 papers in training set | Top 0.1% | 8.3%
2. PLOS Computational Biology | 1633 papers in training set | Top 5% | 7.1%
3. Nature Communications | 4913 papers in training set | Top 27% | 6.7%
4. Cancer Research Communications | 46 papers in training set | Top 0.1% | 6.3%
5. Genome Medicine | 154 papers in training set | Top 1% | 6.3%
6. JCO Clinical Cancer Informatics | 18 papers in training set | Top 0.1% | 6.3%
7. Bioinformatics | 1061 papers in training set | Top 4% | 4.8%
8. Journal for ImmunoTherapy of Cancer | 64 papers in training set | Top 0.2% | 4.8%

50% of probability mass above

9. BMC Bioinformatics | 383 papers in training set | Top 2% | 4.3%
10. PLOS ONE | 4510 papers in training set | Top 40% | 3.6%
11. Scientific Reports | 3102 papers in training set | Top 42% | 3.0%
12. Cell Reports Medicine | 140 papers in training set | Top 3% | 2.1%
13. Cancers | 200 papers in training set | Top 3% | 1.6%
14. Clinical Infectious Diseases | 231 papers in training set | Top 3% | 1.6%
15. JNCI Cancer Spectrum | 10 papers in training set | Top 0.3% | 1.5%
16. JCO Precision Oncology | 14 papers in training set | Top 0.2% | 1.3%
17. Cell Systems | 167 papers in training set | Top 9% | 1.3%
18. Journal of Translational Medicine | 46 papers in training set | Top 2% | 1.2%
19. Leukemia | 39 papers in training set | Top 0.6% | 1.2%
20. Cancer Research | 116 papers in training set | Top 3% | 1.2%
21. Bioinformatics Advances | 184 papers in training set | Top 4% | 0.9%
22. Frontiers in Immunology | 586 papers in training set | Top 7% | 0.9%
23. Biology Methods and Protocols | 53 papers in training set | Top 2% | 0.9%
24. PeerJ | 261 papers in training set | Top 13% | 0.9%
25. International Journal of Cancer | 42 papers in training set | Top 1% | 0.8%
26. Frontiers in Bioinformatics | 45 papers in training set | Top 1.0% | 0.7%
27. Clinical Pharmacology & Therapeutics | 25 papers in training set | Top 0.8% | 0.7%
28. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 46% | 0.7%
29. Clinical Cancer Research | 58 papers in training set | Top 2% | 0.6%
30. Briefings in Bioinformatics | 326 papers in training set | Top 8% | 0.6%
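
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked with a short cumulative sum. The sketch below is illustrative only: the helper name `journals_to_reach` is hypothetical, and the values are the displayed (rounded) percentages for the ten highest-ranked journals in the list above.

```python
# Displayed predicted probabilities (percent) for ranks 1-10, copied from the list.
probs = [8.3, 7.1, 6.7, 6.3, 6.3, 6.3, 4.8, 4.8, 4.3, 3.6]

def journals_to_reach(probs, threshold):
    """Return (rank, cumulative mass) at the first rank where the running
    total reaches the threshold, or None if it is never reached."""
    total = 0.0
    for rank, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None

if __name__ == "__main__":
    print(journals_to_reach(probs, 50.0))
```

Because the listed percentages are rounded to one decimal place, the cumulative total computed this way may differ slightly from the value the site computes internally.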