Adaptive Integration of Heterogeneous Foundation Models to Find Histologically Predictable Genes in Breast Cancer

Nguyen, H.; Li, C.; Peng, C.; Simpson, P.; Ye, N.; Nguyen, Q.

2026-04-08 bioinformatics
10.64898/2026.04.05.716435 bioRxiv
Foundation models for computational pathology have rapidly emerged as powerful tools for extracting rich biological and morphological representations from histopathology images. However, variations in model architecture, pre-training data, and optimization objectives often lead to task-dependent performance, rather than universal generalization. As a result, effective strategies for integrating their complementary strengths are essential to fully realize the potential of foundation models for robust histopathology analysis. Meanwhile, recent breakthroughs such as spatial transcriptomics provide an unprecedented opportunity to integrate genetic and histopathology information from the same patient sample, thereby maximizing both molecular and anatomical pathology insights. In the proposed framework, each model's embedding is first mapped to gene-level predictions via a dedicated prediction head, enabling model-specific feature utilization. A lightweight weighting network then adaptively aggregates these predictions to produce a unified and robust output at gene and spatial location levels. Across multiple spatial transcriptomics datasets, our approach consistently outperforms both individual foundation models and classical ensembling methods. Focusing on breast cancer, we observe substantial gains in prediction accuracy for clinically relevant PAM50 subtype markers and drug-target genes. Moreover, the proposed framework improves interpretability by revealing model-specific contributions and specialization at the gene level. Overall, our work presents an effective solution to integrating multiple foundation models for enhancing the genetic analyses of histopathology images.
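
The two-stage design described in the abstract (one prediction head per foundation model, then a weighting network that combines the per-model predictions for each gene and spot) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear heads, the single-layer gate, the choice to feed the gate the concatenated embeddings, the softmax normalization across models, and every name below (`adaptive_ensemble`, `gate_w`, `gate_b`) are assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_ensemble(embeddings, heads, gate_w, gate_b):
    """Fuse per-model gene predictions with per-spot, per-gene weights.

    embeddings: list of K arrays, each (n_spots, d_k), one per foundation model.
    heads:      list of K (W_k, b_k) pairs; W_k is (d_k, n_genes). Hypothetical linear heads.
    gate_w:     (sum of d_k, K * n_genes) weights of a hypothetical one-layer gate.
    gate_b:     (K * n_genes,) bias of the gate.
    """
    # Stage 1: each model's embedding goes through its own head.
    preds = np.stack(
        [emb @ w + b for emb, (w, b) in zip(embeddings, heads)], axis=1
    )  # (n_spots, K, n_genes)
    n_spots, n_models, n_genes = preds.shape

    # Stage 2 (assumed form): the gate sees all embeddings and emits one
    # logit per model and gene at every spot.
    gate_in = np.concatenate(embeddings, axis=1)
    logits = (gate_in @ gate_w + gate_b).reshape(n_spots, n_models, n_genes)
    weights = softmax(logits, axis=1)  # weights sum to 1 across models

    fused = (weights * preds).sum(axis=1)  # (n_spots, n_genes)
    return fused, weights, preds

# Toy usage with random data: two models with different embedding sizes.
rng = np.random.default_rng(0)
n_spots, n_genes = 5, 3
dims = [8, 16]
embeddings = [rng.normal(size=(n_spots, d)) for d in dims]
heads = [(rng.normal(size=(d, n_genes)), np.zeros(n_genes)) for d in dims]
gate_w = np.zeros((sum(dims), len(dims) * n_genes))
gate_b = np.zeros(len(dims) * n_genes)
fused, weights, preds = adaptive_ensemble(embeddings, heads, gate_w, gate_b)
```

With an all-zero gate, as in the toy usage, the weights are uniform and the output reduces to simple averaging, which is one of the classical ensembling baselines the abstract compares against. The returned `weights` array is also what one would inspect to see which model contributes most to each gene.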

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Advanced Science | 249 papers in training set | Top 0.5% | 15.0%
2. Bioinformatics | 1061 papers in training set | Top 3% | 8.6%
3. Nature Communications | 4913 papers in training set | Top 28% | 6.5%
4. PLOS Computational Biology | 1633 papers in training set | Top 7% | 5.0%
5. Briefings in Bioinformatics | 326 papers in training set | Top 1% | 5.0%
6. Medical Image Analysis | 33 papers in training set | Top 0.3% | 4.4%
7. iScience | 1063 papers in training set | Top 4% | 3.7%
8. IEEE Transactions on Computational Biology and Bioinformatics | 17 papers in training set | Top 0.1% | 2.1%
50% of probability mass above
9. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 3% | 2.1%
10. IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 0.7% | 2.1%
11. Scientific Reports | 3102 papers in training set | Top 53% | 1.9%
12. Nature Machine Intelligence | 61 papers in training set | Top 2% | 1.9%
13. Nucleic Acids Research | 1128 papers in training set | Top 9% | 1.9%
14. Patterns | 70 papers in training set | Top 0.7% | 1.8%
15. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 30% | 1.8%
16. npj Systems Biology and Applications | 99 papers in training set | Top 1.0% | 1.8%
17. Cell Systems | 167 papers in training set | Top 7% | 1.7%
18. Communications Biology | 886 papers in training set | Top 8% | 1.7%
19. IEEE Transactions on Medical Imaging | 18 papers in training set | Top 0.3% | 1.5%
20. Frontiers in Genetics | 197 papers in training set | Top 5% | 1.5%
21. Genome Medicine | 154 papers in training set | Top 6% | 1.3%
22. Bioinformatics Advances | 184 papers in training set | Top 4% | 1.0%
23. BMC Bioinformatics | 383 papers in training set | Top 6% | 0.9%
24. PLOS ONE | 4510 papers in training set | Top 64% | 0.9%
25. Science Advances | 1098 papers in training set | Top 26% | 0.9%
26. Cancer Research | 116 papers in training set | Top 3% | 0.8%
27. Cell Genomics | 162 papers in training set | Top 6% | 0.8%
28. Genome Research | 409 papers in training set | Top 4% | 0.8%
29. GigaScience | 172 papers in training set | Top 3% | 0.7%
30. Cell Reports Medicine | 140 papers in training set | Top 8% | 0.7%
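
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked with a short cumulative sum over the listed percentages. The sketch below uses the first ten displayed values; they are rounded on the page, so the totals are approximate, and the helper name `journals_to_reach` is hypothetical.

```python
from itertools import accumulate

# Predicted probabilities (in percent) for ranks 1-10, as displayed in the list.
probs = [15.0, 8.6, 6.5, 5.0, 5.0, 4.4, 3.7, 2.1, 2.1, 2.1]

def journals_to_reach(probabilities, threshold):
    """Return the smallest rank whose cumulative probability meets the threshold."""
    for rank, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return rank
    return None  # threshold not reached by the values supplied

n_needed = journals_to_reach(probs, 50.0)
print(n_needed)  # 8
```

The running total after seven journals is about 48.2%, and the eighth entry brings it to about 50.3%, which matches the position of the 50% marker in the list.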