CT4CMS: Preoperative Computed Tomography-Based Consensus Molecular Subtyping Prediction in Colorectal Cancer Using Interpretable Deep Learning

Zhang, X.; Nie, X.; Wu, T.; Cai, D.; Xue, H.; Qi, L.; Wang, Y.; Cao, Y.; He, L.; Zhang, Y.; Cheng, Y.; Wang, H.; Wang, X.; Li, E.; Dong, Y.; Gao, F.; Wang, X.

2026-03-10 oncology
10.64898/2026.03.08.26347898 medRxiv

Consensus molecular subtyping (CMS) defines the transcriptomic taxonomy of colorectal cancer (CRC) and guides precision therapy. Although current approaches can predict CMS from histopathology, they rely on surgical specimens, limiting their preoperative applicability. In this study, we developed a deep learning model to infer CMS directly from preoperative computed tomography (CT) scans, enabling noninvasive molecular stratification of CRC. A multi-institutional cohort of 2,444 CRC patients was collected from the Sixth Affiliated Hospital of Sun Yat-sen University and Liaoning Cancer Hospital, comprising a discovery cohort (n = 416), an internal validation cohort (n = 1,671), and an external validation cohort (n = 357). To achieve robust feature extraction, a self-supervised 3D representation learning network was first pretrained on large-scale public CT datasets to capture generalizable imaging features. These representations were subsequently integrated into a multi-instance learning (MIL) classifier for CMS prediction, with attention mechanisms to enhance interpretability. Model performance was evaluated by cross-validation on the discovery cohort and verified on the two validation cohorts. CT4CMS demonstrated strong performance in predicting CMS subtypes directly from CT scans, achieving a cross-validation AUC of 0.867. In both validation cohorts, patients predicted as CMS4 exhibited significantly poorer disease-free survival yet derived substantial benefit from adjuvant chemotherapy, consistent with transcriptome-defined subtyping trends observed in the discovery cohort. Interpretability analysis revealed distinct subtype-specific radiomic features, suggesting that CT-derived imaging features capture underlying molecular characteristics and enable CMS classification. 
Overall, this study establishes a noninvasive and interpretable deep learning framework for CMS prediction in CRC, paving the way for imaging-based molecular stratification and personalized therapeutic decision-making.
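The abstract describes an attention-based multi-instance learning (MIL) classifier on top of pretrained CT features but does not give its exact form. As a rough illustration only, the sketch below assumes a common tanh-scored attention pooling over per-patch feature vectors, followed by a linear layer over four CMS classes. The function names, the toy feature vectors, and all weights are hypothetical and are not taken from CT4CMS.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_mil_pool(instances, V, w):
    """Pool a bag of instance features into one vector using attention.

    instances: K feature vectors of length D (e.g. one per CT patch; assumed).
    V: H x D projection matrix, w: length-H scoring vector (both hypothetical).
    Returns (bag_vector, attention_weights).
    """
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))
    attn = softmax(scores)
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(attn, instances)) for d in range(dim)]
    return bag, attn


def classify(bag, W, b):
    """Linear layer plus softmax over the classes (here CMS1 to CMS4)."""
    logits = [sum(wj * xj for wj, xj in zip(row, bag)) + bc
              for row, bc in zip(W, b)]
    return softmax(logits)


# Toy bag of three instances with 3-dimensional features (made-up numbers).
instances = [[0.1, 0.2, 0.0], [0.9, 0.8, 0.7], [0.2, 0.1, 0.3]]
V = [[1.0, 0.5, -0.5], [0.3, -0.2, 0.8]]
w = [1.2, 0.7]
W = [[0.5, -0.1, 0.2], [-0.3, 0.4, 0.1], [0.2, 0.2, -0.4], [0.1, 0.6, 0.3]]
b = [0.0, 0.1, -0.1, 0.0]

bag, attn = attention_mil_pool(instances, V, w)
probs = classify(bag, W, b)
print("attention weights:", [round(a, 3) for a in attn])
print("class probabilities:", [round(p, 3) for p in probs])
```

In a setup like this, the attention weights are what supports interpretability: each weight indicates how much a given instance contributed to the bag-level prediction.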

Matching journals

The top 13 journals account for 50% of the predicted probability mass.

1. Nature Communications | 4913 papers in training set | Top 22% | 8.4%
2. Cancer Cell | 38 papers in training set | Top 0.1% | 6.8%
3. Scientific Reports | 3102 papers in training set | Top 24% | 4.8%
4. Clinical Cancer Research | 58 papers in training set | Top 0.5% | 3.6%
5. Cancer Research | 116 papers in training set | Top 0.9% | 3.6%
6. Molecular Cancer | 14 papers in training set | Top 0.1% | 3.6%
7. European Journal of Cancer | 10 papers in training set | Top 0.1% | 3.6%
8. Cell Reports Medicine | 140 papers in training set | Top 2% | 3.2%
9. npj Digital Medicine | 97 papers in training set | Top 1% | 3.2%
10. npj Precision Oncology | 48 papers in training set | Top 0.2% | 3.1%
11. Theranostics | 33 papers in training set | Top 0.3% | 2.6%
12. Journal of Translational Medicine | 46 papers in training set | Top 0.5% | 2.1%
13. Frontiers in Oncology | 95 papers in training set | Top 2% | 1.9%

50% of probability mass above

14. Signal Transduction and Targeted Therapy | 29 papers in training set | Top 0.6% | 1.9%
15. Advanced Science | 249 papers in training set | Top 9% | 1.9%
16. iScience | 1063 papers in training set | Top 12% | 1.9%
17. Annals of Oncology | 13 papers in training set | Top 0.5% | 1.7%
18. PLOS ONE | 4510 papers in training set | Top 55% | 1.7%
19. eLife | 5422 papers in training set | Top 43% | 1.7%
20. Cancer Letters | 32 papers in training set | Top 0.3% | 1.5%
21. eBioMedicine | 130 papers in training set | Top 2% | 1.5%
22. Cancers | 200 papers in training set | Top 4% | 1.2%
23. IEEE Transactions on Medical Imaging | 18 papers in training set | Top 0.4% | 0.9%
24. JNCI: Journal of the National Cancer Institute | 16 papers in training set | Top 0.5% | 0.9%
25. Briefings in Bioinformatics | 326 papers in training set | Top 5% | 0.9%
26. Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 5% | 0.9%
27. Cell Reports | 1338 papers in training set | Top 31% | 0.9%
28. Communications Biology | 886 papers in training set | Top 19% | 0.9%
29. JCO Clinical Cancer Informatics | 18 papers in training set | Top 0.8% | 0.8%
30. PLOS Computational Biology | 1633 papers in training set | Top 23% | 0.8%
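
The statement that the top 13 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order. The sketch below does that with the first 13 values from the list. The function name `journals_needed` is hypothetical, and because the displayed percentages are rounded, the cumulative total is only approximate.

```python
def journals_needed(probabilities, mass=50.0):
    """Return (count, covered) for the shortest ranked prefix reaching `mass`.

    probabilities: per-journal predicted probabilities in percent, in rank order.
    Returns (None, total) if the listed values never reach the target mass.
    """
    total = 0.0
    for count, p in enumerate(probabilities, start=1):
        total += p
        if total >= mass:
            return count, total
    return None, total


# Rounded probabilities (percent) of the 13 highest-ranked journals listed above.
listed = [8.4, 6.8, 4.8, 3.6, 3.6, 3.6, 3.6, 3.2, 3.2, 3.1, 2.6, 2.1, 1.9]

count, covered = journals_needed(listed)
print(f"{count} journals cover about {covered:.1f}% of the probability mass")
```

With these rounded values the threshold is first crossed at the 13th journal, in line with the cutoff shown in the list.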