TumorCLIP: Lightweight Vision-Language Fusion for Explainable MRI-Based Brain Tumor Classification

Jia, Y.; Niu, J.; Qie, Z.; Li, Z.; Laine, A. F.; Guo, J.

2026-03-13 radiology and imaging
10.64898/2026.03.11.26348155 medRxiv

Accurate classification of brain tumors from MRI is critical for guiding clinical decision-making; however, existing deep learning models are often hindered by limited interpretability and pronounced sensitivity to hyperparameter selection, which constrain their reliability in medical settings. To address these challenges, we propose TumorCLIP, a lightweight and training-efficient vision-language framework that fuses radiology-informed text prototypes with a DenseNet-based visual encoder via a Tip-Adapter mechanism to support clinically meaningful semantic reasoning. TumorCLIP does not aim to introduce a new vision-language model architecture. Instead, its contribution lies in the integration of radiology-informed text prototypes tailored to MRI interpretation, a systematic evaluation of backbone stability across diverse visual architectures, and a lightweight, training-efficient CLIP-based fusion framework designed for medical imaging applications. We first conduct a comprehensive unimodal benchmark across eight representative visual backbones (EfficientNet-B0, MobileNetV3-Large, ResNet50, DenseNet121, ViT, DeiT, Swin Transformer, and MambaOut) using a standardized optimizer and learning-rate grid search, revealing performance swings exceeding 60 percentage points depending on hyperparameter choices. DenseNet121 shows the strongest stability-accuracy trade-off within our evaluated optimizer and learning-rate grid (97.6%). Building on this foundation, TumorCLIP fuses image features with frozen CLIP-derived text prototypes, achieving concept-level explainability, robust few-shot adaptation, and enhanced classification of minority tumor classes. On the test set, TumorCLIP attains 98.5% accuracy, including a +1.86 percentage point recall increase for Neurocytoma, suggesting that radiology-informed textual priors can improve semantic alignment and help refine diagnostic decision boundaries within the evaluated setting.
Additional evaluation on an independent external dataset shows that TumorCLIP achieves improved cross-dataset performance under the evaluated distribution shift, relative to the unimodal DenseNet121 baseline. These results demonstrate TumorCLIP as a practical, interpretable, and data-efficient alternative to conventional visual classifiers, providing evidence for radiology-aware vision-language alignment in MRI-based brain tumor classification. All results are reported within the evaluated datasets and training protocols.
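
The abstract names a Tip-Adapter mechanism for fusing image features with frozen text prototypes but does not spell it out. The following is a minimal sketch of the general Tip-Adapter idea (a few-shot key-value cache whose logits are added to text-prototype similarity logits), not the authors' implementation. The function name `tip_adapter_logits`, the synthetic features, and the values of `alpha`, `beta`, and `scale` are all hypothetical placeholders; how TumorCLIP maps DenseNet features into the text embedding space is not described here and is assumed away by using features that already share one space.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def tip_adapter_logits(query, cache_keys, cache_labels, text_prototypes,
                       alpha=1.0, beta=5.5, scale=100.0):
    """Tip-Adapter-style fusion (illustrative; hyperparameters are placeholders).

    query:           (n_query, dim) image features
    cache_keys:      (n_cache, dim) few-shot training image features
    cache_labels:    (n_cache, n_classes) one-hot labels of the cached samples
    text_prototypes: (n_classes, dim) frozen text embeddings, one per class
    """
    q = l2_normalize(query)
    k = l2_normalize(cache_keys)
    t = l2_normalize(text_prototypes)

    # Cache branch: similarity to each cached sample, sharpened by beta,
    # then aggregated per class through the one-hot labels.
    affinity = q @ k.T
    cache_logits = np.exp(-beta * (1.0 - affinity)) @ cache_labels

    # Text branch: scaled cosine similarity to each class prototype.
    text_logits = scale * (q @ t.T)

    return text_logits + alpha * cache_logits


# Synthetic data (assumption): three classes with well-separated centers.
rng = np.random.default_rng(0)
n_classes, dim, shots = 3, 16, 4
centers = 3.0 * np.eye(n_classes, dim)

text_prototypes = centers + 0.1 * rng.normal(size=(n_classes, dim))
labels = np.repeat(np.arange(n_classes), shots)
cache_keys = centers[labels] + 0.1 * rng.normal(size=(n_classes * shots, dim))
cache_labels = np.eye(n_classes)[labels]
query = centers + 0.1 * rng.normal(size=(n_classes, dim))

logits = tip_adapter_logits(query, cache_keys, cache_labels, text_prototypes)
pred = logits.argmax(axis=1)
print(pred)
```

In this formulation the text branch supplies a class prior that needs no training, while the cache branch adjusts it using a handful of labeled examples; `alpha` controls how much weight the cache receives.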

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Top % | Predicted probability
1 | Nature Machine Intelligence | 61 | Top 0.1% | 14.3%
2 | Nature Communications | 4913 | Top 14% | 12.4%
3 | IEEE Transactions on Medical Imaging | 18 | Top 0.1% | 7.1%
4 | Nature Medicine | 117 | Top 0.3% | 6.3%
5 | Scientific Reports | 3102 | Top 28% | 4.3%
6 | Medical Image Analysis | 33 | Top 0.3% | 3.8%
7 | Imaging Neuroscience | 242 | Top 1% | 3.6%
(50% of probability mass above)
8 | npj Digital Medicine | 97 | Top 1% | 3.6%
9 | NeuroImage | 813 | Top 3% | 2.9%
10 | Nature Computational Science | 50 | Top 0.3% | 2.6%
11 | Human Brain Mapping | 295 | Top 2% | 2.1%
12 | eBioMedicine | 130 | Top 0.8% | 2.1%
13 | PLOS ONE | 4510 | Top 51% | 1.9%
14 | Medical Physics | 14 | Top 0.3% | 1.9%
15 | Patterns | 70 | Top 0.8% | 1.8%
16 | Science Translational Medicine | 111 | Top 3% | 1.7%
17 | European Radiology | 14 | Top 0.4% | 1.7%
18 | Proceedings of the National Academy of Sciences | 2130 | Top 36% | 1.3%
19 | Communications Medicine | 85 | Top 0.5% | 1.2%
20 | npj Precision Oncology | 48 | Top 0.9% | 1.2%
21 | Frontiers in Computational Neuroscience | 53 | Top 2% | 0.9%
22 | Science Advances | 1098 | Top 26% | 0.9%
23 | Journal of Medical Imaging | 11 | Top 0.3% | 0.8%
24 | NeuroImage: Clinical | 132 | Top 4% | 0.8%
25 | IEEE Transactions on Biomedical Engineering | 38 | Top 1.0% | 0.7%
26 | Frontiers in Neuroscience | 223 | Top 8% | 0.7%
27 | Diagnostics | 48 | Top 2% | 0.7%
28 | Biology Methods and Protocols | 53 | Top 3% | 0.7%
29 | GigaScience | 172 | Top 3% | 0.7%
30 | Journal of Biomedical Informatics | 45 | Top 2% | 0.7%
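
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed probabilities. The sketch below copies the first ten percentages from the list above; the helper name `journals_needed` is hypothetical, and the site's own computation may differ (for example in rounding).

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-10, copied from the list above.
probs = [14.3, 12.4, 7.1, 6.3, 4.3, 3.8, 3.6, 3.6, 2.9, 2.6]


def journals_needed(probabilities, threshold=50.0):
    """Return (rank, cumulative mass) at the first rank reaching the threshold."""
    for rank, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return rank, total
    raise ValueError("threshold not reached by the supplied probabilities")


rank, total = journals_needed(probs)
print(rank, round(total, 1))
```

The cumulative mass after six journals is 48.2%, so the seventh journal is the first to push the total past the 50% mark, at 51.8%.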