
LiteMIL: A Computationally Efficient Transformer-Based MIL for Cancer Subtyping on Whole Slide Images.

Kussaibi, H.

2025-05-12 pathology
10.1101/2025.05.11.25327389 medRxiv

Purpose: Accurate cancer subtyping is crucial for effective treatment, but it is challenging because of overlapping morphology and variability among pathologists. Although deep learning (DL) methods have shown potential, their application to gigapixel whole slide images (WSIs) is often hindered by high computational demands and the need for efficient, context-aware feature aggregation. This study introduces LiteMIL, a computationally efficient transformer-based multiple instance learning (MIL) network combined with Phikon, a pathology-tuned self-supervised feature extractor, for robust and scalable cancer subtyping on WSIs.

Methods: Patches were extracted from the TCGA-THYM dataset (242 WSIs, six subtypes) and passed on the fly to Phikon for feature extraction. To train the MIL models, features were arranged into uniform bags using a chunking strategy that maintains tissue context while increasing the amount of training data. LiteMIL uses a learnable query vector within an optimized multi-head attention module for effective feature aggregation. The model's performance was evaluated against established MIL methods on the thymic dataset and three additional TCGA datasets (breast, lung, and kidney cancer).

Results: LiteMIL achieved an F1 score of 0.89 ± 0.01 and an AUC of 0.99 on the thymic dataset, outperforming the other MIL methods. It generalized well across the external datasets, scoring best on the breast and kidney cancer datasets. Compared with TransMIL, LiteMIL significantly reduces training time and GPU memory usage. Ablation studies confirmed the critical role of the learnable query and layer normalization in enhancing performance and stability.

Conclusion: LiteMIL offers a resource-efficient, robust solution. Its streamlined architecture, combined with the compact Phikon features, makes it suitable for integration into routine histopathological workflows, particularly in resource-limited settings.
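The two mechanisms named in the Methods, chunking features into uniform bags and pooling a bag with a query vector, can be illustrated with a minimal sketch. This is not the author's implementation: it uses a single attention head with no learned projections and no layer normalization. The function names, the bag size, the toy feature dimension, and the choice to drop a final partial chunk are all assumptions made for illustration.

```python
import math
import random


def chunk_into_bags(features, bag_size):
    """Split one slide's ordered patch features into consecutive, equal-size bags.

    Consecutive chunks keep neighbouring patches together. A trailing partial
    chunk is dropped so every bag has the same length (an assumption here).
    """
    n_full = len(features) // bag_size
    return [features[i * bag_size:(i + 1) * bag_size] for i in range(n_full)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def query_attention_pool(bag, query):
    """Pool a bag into one vector with single-head scaled dot-product attention.

    `query` stands in for a learnable parameter; here it is just a fixed vector.
    Returns the pooled vector and the attention weight of each patch.
    """
    d = len(query)
    scores = [
        sum(q * x for q, x in zip(query, patch)) / math.sqrt(d)
        for patch in bag
    ]
    weights = softmax(scores)
    pooled = [sum(w * patch[j] for w, patch in zip(weights, bag)) for j in range(d)]
    return pooled, weights


# Toy data: 10 patch features of dimension 8 (hypothetical sizes).
rng = random.Random(0)
dim = 8
slide_features = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(10)]

bags = chunk_into_bags(slide_features, bag_size=4)  # two bags of 4 patches each
query = [rng.gauss(0, 1) for _ in range(dim)]       # would be trained in practice
pooled, weights = query_attention_pool(bags[0], query)
```

In this sketch each slide yields several training bags instead of one, which is one way chunking can increase the amount of training data. The single query also keeps the attention cost linear in the number of patches per bag, whereas full self-attention over all patches is quadratic.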

Matching journals

The top 2 journals account for 50% of the predicted probability mass.

1. Journal of Pathology Informatics: 13 papers in training set; Top 0.1%; 40.5%
2. Modern Pathology: 21 papers in training set; Top 0.1%; 10.7%
(50% of probability mass above)
3. Medical Image Analysis: 33 papers in training set; Top 0.2%; 5.0%
4. Scientific Reports: 3102 papers in training set; Top 43%; 2.8%
5. Biology Methods and Protocols: 53 papers in training set; Top 0.5%; 2.4%
6. Nature Communications: 4913 papers in training set; Top 46%; 2.2%
7. Computers in Biology and Medicine: 120 papers in training set; Top 2%; 1.8%
8. eBioMedicine: 130 papers in training set; Top 1%; 1.7%
9. IEEE Journal of Biomedical and Health Informatics: 34 papers in training set; Top 1.0%; 1.7%
10. GigaScience: 172 papers in training set; Top 1%; 1.7%
11. Journal of Medical Imaging: 11 papers in training set; Top 0.1%; 1.5%
12. Clinical Chemistry: 22 papers in training set; Top 0.4%; 1.5%
13. Computational and Structural Biotechnology Journal: 216 papers in training set; Top 5%; 1.4%
14. PLOS ONE: 4510 papers in training set; Top 59%; 1.3%
15. npj Precision Oncology: 48 papers in training set; Top 0.8%; 1.3%
16. Cancers: 200 papers in training set; Top 4%; 1.1%
17. BMC Medicine: 163 papers in training set; Top 5%; 1.1%
18. BMC Medical Informatics and Decision Making: 39 papers in training set; Top 2%; 1.0%
19. Laboratory Investigation: 13 papers in training set; Top 0.2%; 0.9%
20. The Journal of Pathology: 22 papers in training set; Top 0.3%; 0.9%
21. Breast Cancer Research: 32 papers in training set; Top 0.4%; 0.9%
22. iScience: 1063 papers in training set; Top 25%; 0.9%
23. Diagnostics: 48 papers in training set; Top 2%; 0.9%
24. Bioinformatics: 1061 papers in training set; Top 9%; 0.8%
25. The American Journal of Pathology: 31 papers in training set; Top 0.4%; 0.8%
26. Genomics, Proteomics & Bioinformatics: 171 papers in training set; Top 6%; 0.8%
27. Biological Imaging: 15 papers in training set; Top 0.2%; 0.7%
28. Journal of Medical Internet Research: 85 papers in training set; Top 5%; 0.7%
29. PLOS Computational Biology: 1633 papers in training set; Top 25%; 0.7%
30. Neuropathology and Applied Neurobiology: 14 papers in training set; Top 0.9%; 0.5%