TumorArchetypeR: A modular framework to derive signature-based tumor subtypes

Luetge, M.; Nassiri, S.

2026-05-14 cancer biology
10.64898/2026.05.11.724259 bioRxiv
Motivation: The tumor microenvironment (TME) dictates cancer progression and therapeutic response, yet translating TME subtypes into robust clinical biomarkers remains a significant challenge. Existing classification models typically rely on static gene signatures and cohort-dependent normalization, making them ill-suited for application to the small, unbalanced datasets common in early-phase clinical trials. To better guide drug development, methods are required that offer the flexibility to target specific biological contexts and bridge the gap between the discovery of tumor archetypes and their robust translation to individual patient samples.

Results: We developed TumorArchetypeR, a modular R package that unifies unsupervised subtype discovery with the generation of rank-based, single-sample classifiers. By leveraging a systematic parameter grid search, the framework identifies stable, data-driven subtypes rather than relying on arbitrary defaults. Crucially, to ensure clinical translatability, the package includes a module to train a robust classifier using binary gene-pair rules, enabling prediction without cohort-level preprocessing. Applying TumorArchetypeR to colorectal cancer, we resolved the heterogeneity of fibrotic tumors, distinguishing an immunosuppressive "Immune-enriched/Fibrotic" state from an immune-excluded "Fibrotic/Myeloid" phenotype. Furthermore, we identified a distinct "Th/B-cell enriched" archetype associated with superior survival, a group largely obscured by existing pan-cancer models. With our rank-based classifier demonstrating robust performance on previously unseen samples, these findings highlight TumorArchetypeR as a scalable, end-to-end solution for refining patient stratification and optimizing precision oncology strategies. The TumorArchetypeR package and documentation are openly available on GitHub at https://github.com/lutgem/TumorArchetypeR.
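To illustrate why binary gene-pair rules allow prediction without cohort-level preprocessing, the following is a minimal Python sketch of a generic pairwise-comparison classifier. It is not the TumorArchetypeR API or the authors' trained model: the gene names (GENE_A to GENE_E), the rule list, the assignment of rules to subtypes, and the function name classify_sample are all hypothetical and chosen only for illustration. Each rule asks whether one gene is expressed above another within the same sample, so the answer depends only on within-sample ordering.

```python
from collections import Counter

# Hypothetical rules (assumption, not from the package):
# (gene_a, gene_b, subtype) means "vote for subtype if expr[gene_a] > expr[gene_b]".
RULES = [
    ("GENE_A", "GENE_B", "Fibrotic/Myeloid"),
    ("GENE_C", "GENE_D", "Fibrotic/Myeloid"),
    ("GENE_E", "GENE_A", "Th/B-cell enriched"),
    ("GENE_D", "GENE_C", "Th/B-cell enriched"),
]


def classify_sample(expr, rules=RULES):
    """Return the subtype with the most satisfied gene-pair rules.

    expr maps gene name -> expression value for ONE sample.
    Returns None when no rule is satisfied.
    """
    votes = Counter()
    for gene_a, gene_b, subtype in rules:
        # Rules with a missing gene are skipped rather than guessed.
        if gene_a in expr and gene_b in expr and expr[gene_a] > expr[gene_b]:
            votes[subtype] += 1
    if not votes:
        return None
    # Ties are broken alphabetically so the result is deterministic.
    return max(sorted(votes), key=votes.__getitem__)


# One illustrative sample (made-up values).
sample = {"GENE_A": 5.0, "GENE_B": 2.0, "GENE_C": 7.0, "GENE_D": 1.0, "GENE_E": 3.0}

# Rescaling every value by a positive factor leaves all pairwise
# comparisons, and therefore the predicted subtype, unchanged.
rescaled = {gene: 1000.0 * value for gene, value in sample.items()}

print(classify_sample(sample))
print(classify_sample(rescaled))
```

Because only the relative order of two genes within a sample matters, any strictly increasing transformation of the expression values gives the same call, which is the property that makes this family of classifiers usable on a single patient sample.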

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Clinical Cancer Research: 58 papers in training set, Top 0.1%, 12.5%
2. npj Precision Oncology: 48 papers in training set, Top 0.1%, 10.5%
3. Nature Communications: 4913 papers in training set, Top 18%, 10.1%
4. Nature Cancer: 35 papers in training set, Top 0.1%, 6.8%
5. Genome Medicine: 154 papers in training set, Top 0.9%, 6.4%
6. Cell Reports Medicine: 140 papers in training set, Top 0.7%, 4.9%

50% of probability mass above

7. JCO Clinical Cancer Informatics: 18 papers in training set, Top 0.2%, 4.9%
8. Cancer Research: 116 papers in training set, Top 0.8%, 3.6%
9. npj Digital Medicine: 97 papers in training set, Top 1%, 3.3%
10. Nature Medicine: 117 papers in training set, Top 1%, 3.1%
11. Journal for ImmunoTherapy of Cancer: 64 papers in training set, Top 0.4%, 2.6%
12. PLOS Computational Biology: 1633 papers in training set, Top 13%, 2.4%
13. PLOS ONE: 4510 papers in training set, Top 47%, 2.1%
14. Cancer Cell: 38 papers in training set, Top 0.9%, 1.8%
15. Cancer Discovery: 61 papers in training set, Top 1%, 1.7%
16. Cancer Research Communications: 46 papers in training set, Top 0.4%, 1.7%
17. Journal of Translational Medicine: 46 papers in training set, Top 1%, 1.2%
18. Scientific Reports: 3102 papers in training set, Top 69%, 1.0%
19. eBioMedicine: 130 papers in training set, Top 3%, 0.9%
20. Cell Systems: 167 papers in training set, Top 12%, 0.7%
21. Patterns: 70 papers in training set, Top 2%, 0.7%
22. npj Systems Biology and Applications: 99 papers in training set, Top 2%, 0.7%
23. Bioinformatics Advances: 184 papers in training set, Top 5%, 0.7%
24. Breast Cancer Research: 32 papers in training set, Top 0.5%, 0.7%
25. npj Breast Cancer: 18 papers in training set, Top 0.2%, 0.6%
26. Molecular Cancer Therapeutics: 33 papers in training set, Top 0.8%, 0.6%
27. Science Translational Medicine: 111 papers in training set, Top 7%, 0.6%