
Direct pathway enrichment prediction from histopathological whole slide images and comparison with gene expression mediated models

Jabin, A.; Ahmad, S.

2026-03-04 bioinformatics
10.64898/2026.03.02.709137 bioRxiv

Molecular profiling of tumours via RNA sequencing (RNA-seq) enables clinically actionable stratification but remains costly, tissue-intensive, and time-consuming. Recent advances in computational pathology suggest that routine H&E whole-slide images (WSIs) can be used to estimate the transcriptomic states of cancer cells. Because WSI-derived predictions of transcriptional signatures are noisy, using them for accurate biological interpretation is challenging. Pathway enrichment analysis, on the other hand, is routinely used to describe biologically meaningful cellular states from noisy gene expression data, and some studies have evaluated whether WSI-predicted gene expression profiles can reconstruct enriched pathways in experiments where both data modalities were available. However, it remains unclear whether a model designed to predict enriched pathways directly from WSIs would outperform the current approach of first predicting gene expression. Here, we develop and evaluate these two complementary approaches for predicting pathway enrichment profiles from WSIs in TCGA Breast Invasive Carcinoma (TCGA-BRCA). We train parallel models: one set predicts pathway enrichment directly from image features, and the other relies on predicted gene expression profiles, which is the current state of the art. Our results suggest that, under controlled experiments, direct prediction of a selected pool of enriched pathways outperforms models that first predict gene expression and then infer enrichment from the predicted values. These findings will help in prioritizing the goals of predictive modeling of WSIs and in improving diagnostic outcomes for cancer patients.
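The two strategies compared in the abstract can be illustrated with a minimal sketch on synthetic data. Everything here is an assumption made for illustration, not the authors' pipeline: the random "image features", the ridge regressors, the rank-based pathway score (a simple stand-in for a real enrichment method), and all function names are hypothetical.

```python
import numpy as np

def rank_pathway_scores(expr, gene_sets):
    """Per-sample pathway score: mean within-sample rank of the set's genes, scaled to [0, 1]."""
    ranks = expr.argsort(axis=1).argsort(axis=1) / (expr.shape[1] - 1)
    return np.column_stack([ranks[:, idx].mean(axis=1) for idx in gene_sets])

def ridge_fit(X, Y, lam=1.0):
    """Closed-form ridge weights; inputs and targets are centred by the caller."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def mean_pearson(A, B):
    """Mean column-wise Pearson correlation between two matrices."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    num = (A * B).sum(axis=0)
    den = np.sqrt((A ** 2).sum(axis=0) * (B ** 2).sum(axis=0))
    return float(np.mean(num / den))

def compare_strategies(seed=0, n=300, d=20, g=100, n_sets=5, set_size=15):
    rng = np.random.default_rng(seed)
    # Synthetic stand-ins: X plays the role of WSI features, G of RNA-seq expression.
    X = rng.normal(size=(n, d))
    G = X @ rng.normal(size=(d, g)) + rng.normal(scale=3.0, size=(n, g))
    gene_sets = [rng.choice(g, size=set_size, replace=False) for _ in range(n_sets)]
    P = rank_pathway_scores(G, gene_sets)  # "true" pathway scores from measured expression

    train, test = slice(0, 200), slice(200, n)
    x_mean = X[train].mean(axis=0)
    p_mean = P[train].mean(axis=0)
    g_mean = G[train].mean(axis=0)
    Xtr, Xte = X[train] - x_mean, X[test] - x_mean

    # Strategy 1 (direct): features -> pathway scores.
    P_direct = Xte @ ridge_fit(Xtr, P[train] - p_mean) + p_mean

    # Strategy 2 (expression-mediated): features -> genes -> pathway scores.
    G_pred = Xte @ ridge_fit(Xtr, G[train] - g_mean) + g_mean
    P_mediated = rank_pathway_scores(G_pred, gene_sets)

    return {
        "direct": mean_pearson(P_direct, P[test]),
        "mediated": mean_pearson(P_mediated, P[test]),
    }
```

Calling `compare_strategies()` returns the mean held-out correlation for each strategy. A rank-based score is used because, with a purely linear pathway score and the same ridge penalty, the two strategies would produce identical predictions; which one scores higher on this toy data says nothing about the results reported for TCGA-BRCA.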

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. PLOS Computational Biology: 1633 papers in training set, Top 0.7%, 22.5%
2. Scientific Reports: 3102 papers in training set, Top 10%, 8.4%
3. PLOS ONE: 4510 papers in training set, Top 25%, 6.8%
4. BMC Bioinformatics: 383 papers in training set, Top 2%, 6.4%
5. Frontiers in Bioinformatics: 45 papers in training set, Top 0.1%, 6.4%
50% of probability mass above
6. Cancers: 200 papers in training set, Top 1%, 4.3%
7. Bioinformatics: 1061 papers in training set, Top 5%, 4.0%
8. Biology Methods and Protocols: 53 papers in training set, Top 0.5%, 2.4%
9. Computational and Structural Biotechnology Journal: 216 papers in training set, Top 4%, 1.9%
10. Frontiers in Genetics: 197 papers in training set, Top 4%, 1.9%
11. Frontiers in Cell and Developmental Biology: 218 papers in training set, Top 4%, 1.8%
12. iScience: 1063 papers in training set, Top 15%, 1.7%
13. International Journal of Molecular Sciences: 453 papers in training set, Top 8%, 1.7%
14. npj Systems Biology and Applications: 99 papers in training set, Top 1%, 1.3%
15. npj Precision Oncology: 48 papers in training set, Top 0.8%, 1.2%
16. Modern Pathology: 21 papers in training set, Top 0.3%, 1.2%
17. Journal of Pathology Informatics: 13 papers in training set, Top 0.3%, 0.9%
18. GigaScience: 172 papers in training set, Top 3%, 0.8%
19. Briefings in Bioinformatics: 326 papers in training set, Top 6%, 0.8%
20. Cancer Research Communications: 46 papers in training set, Top 1%, 0.7%
21. NAR Genomics and Bioinformatics: 214 papers in training set, Top 4%, 0.7%
22. Disease Models & Mechanisms: 119 papers in training set, Top 3%, 0.7%
23. Computer Methods and Programs in Biomedicine: 27 papers in training set, Top 1.0%, 0.7%
24. Biological Imaging: 15 papers in training set, Top 0.3%, 0.7%
25. PeerJ: 261 papers in training set, Top 16%, 0.7%
26. Patterns: 70 papers in training set, Top 3%, 0.6%
27. Nature Communications: 4913 papers in training set, Top 65%, 0.6%
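
The 50% cutoff in the journal list can be checked by accumulating the listed probabilities in rank order. The short sketch below does this for the first seven entries; the helper name `journals_to_reach` is hypothetical, and the values are simply copied from the list.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-7, copied from the journal list.
probs = [22.5, 8.4, 6.8, 6.4, 6.4, 4.3, 4.0]

def journals_to_reach(probs, threshold=50.0):
    """Return how many top-ranked journals are needed for the running total to reach threshold."""
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count
    return None  # threshold not reached with the values given
```

With these values the running total is about 44.1% after four journals and about 50.5% after five, which matches the statement that the top 5 journals account for 50% of the predicted probability mass.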