
Search and Retrieval in Dermatology Atlases of Histopathology Images for Risk Stratification of Cutaneous Squamous Cell Carcinoma

Alabtah, G.; Alsaafin, A.; Alfasly, S.; Shafique, A.; Hemati, S.; Choudhary, A.; Ravishankar, I. K.; DiCaudo, D.; Nelson, S. A.; Stockard, A.; Leibovit-Reiben, Z.; Zhang, N.; Kalari, K.; Murphree, D.; Mangold, A.; Comfere, N.; Tizhoosh, H. R.

2026-01-06 pathology
10.64898/2026.01.02.26343356 medRxiv

Cutaneous squamous cell carcinoma (cSCC) poses significant clinical challenges due to its rising incidence and potential for metastasis. Histopathologic risk stratification is further limited by substantial inter-observer variability. Unsupervised AI approaches based on content-based image retrieval offer scalable and interpretable decision support for diagnostic pathology. The objective of this study was to evaluate the use of image retrieval within histopathology atlases to stratify cSCC tumor differentiation from whole-slide images (WSIs), while comparing different patch selection and feature extraction strategies. This retrospective study included 552 archived WSIs comprising 385 well-differentiated, 102 moderately differentiated, and 66 poorly differentiated cases collected across Mayo Clinic sites in Arizona, Florida, and Minnesota. Image atlases were constructed using multiple patch aggregation strategies (Mosaic, Collage, and Montage) and deep learning encoders (KimiaNet, PathDino, and H-Optimus-0). A leave-one-WSI-out evaluation framework was used to assess differentiation classification performance using accuracy, specificity, sensitivity, and F1 score. Mosaic combined with KimiaNet achieved the highest Top-1 accuracy (74.9%) and specificity (92.6%), while Mosaic with H-Optimus-0 yielded the best Top-5 accuracy (79.0%) and macro-F1 score (62.6%). Collage combined with KimiaNet produced the highest Top-5 specificity (99.5%). The generalizability of the evaluated AI models varied across hospitals, reflecting differences in imaging protocols, staining practices, and patient populations. Overall, unsupervised image search and retrieval provides effective, annotation-free support for cSCC differentiation and has the potential to enhance dermatopathology workflows when appropriate combinations of patch selection and feature extraction methods are employed.
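
The leave-one-WSI-out retrieval evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline and rests on several assumptions: each WSI is reduced to a single embedding vector (the study instead builds atlases from patch sets such as Mosaic, Collage, or Montage), similarity is cosine similarity, and a Top-k prediction is taken as the majority label among the k most similar other slides. The function names `leave_one_out_topk` and `macro_f1` and the toy data are hypothetical.

```python
from collections import Counter

import numpy as np


def leave_one_out_topk(embeddings, labels, k=5):
    """Predict each item's label by majority vote over its k most similar
    other items (cosine similarity), never retrieving the query itself."""
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # unit-normalise rows
    sim = x @ x.T                                      # cosine similarity matrix
    np.fill_diagonal(sim, -np.inf)                     # exclude the query slide
    preds = []
    for i in range(len(x)):
        nearest = np.argsort(-sim[i])[:k]              # most similar first
        votes = Counter(labels[j] for j in nearest)
        # On a tie, Counter keeps insertion order, so the label of the
        # closest tied neighbour wins.
        preds.append(votes.most_common(1)[0][0])
    return preds


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over the classes in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical): two well-separated groups of 2-D embeddings.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2],
                  [0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]
toy_labels = ["well", "well", "well", "poor", "poor", "poor"]

top1 = leave_one_out_topk(toy_embeddings, toy_labels, k=1)
top3 = leave_one_out_topk(toy_embeddings, toy_labels, k=3)
print("Top-1 accuracy:", np.mean([p == t for p, t in zip(top1, toy_labels)]))
print("Top-3 macro-F1:", macro_f1(toy_labels, top3))
```

Macro-F1 weights each differentiation class equally, which matters for an imbalanced cohort like the one described, where well-differentiated cases far outnumber poorly differentiated ones.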

Matching journals

The top 2 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Modern Pathology | 21 | Top 0.1% | 33.5% |
| 2 | Journal of Pathology Informatics | 13 | Top 0.1% | 18.9% |
| 3 | The Journal of Pathology | 22 | Top 0.1% | 4.4% |
| 4 | Scientific Reports | 3102 | Top 30% | 4.0% |
| 5 | Laboratory Investigation | 13 | Top 0.1% | 3.7% |
| 6 | PLOS ONE | 4510 | Top 38% | 3.6% |
| 7 | The American Journal of Pathology | 31 | Top 0.1% | 2.1% |
| 8 | Computational and Structural Biotechnology Journal | 216 | Top 5% | 1.4% |
| 9 | Nature Communications | 4913 | Top 55% | 1.4% |
| 10 | Journal of Clinical Pathology | 12 | Top 0.2% | 1.4% |
| 11 | Cureus | 67 | Top 3% | 1.4% |
| 12 | Diagnostics | 48 | Top 1% | 1.2% |
| 13 | British Journal of Cancer | 42 | Top 1% | 1.1% |
| 14 | Journal of Medical Imaging | 11 | Top 0.2% | 1.1% |
| 15 | Cancers | 200 | Top 4% | 1.1% |
| 16 | Breast Cancer Research | 32 | Top 0.4% | 0.9% |
| 17 | GigaScience | 172 | Top 2% | 0.9% |
| 18 | JAMA Network Open | 127 | Top 4% | 0.9% |
| 19 | PLOS Medicine | 98 | Top 4% | 0.8% |
| 20 | New Phytologist | 309 | Top 5% | 0.8% |
| 21 | Clinical Chemistry | 22 | Top 0.9% | 0.7% |
| 22 | eBioMedicine | 130 | Top 5% | 0.7% |
| 23 | Biological Imaging | 15 | Top 0.3% | 0.5% |
| 24 | npj Digital Medicine | 97 | Top 4% | 0.5% |
| 25 | The Journal of Molecular Diagnostics | 36 | Top 0.6% | 0.5% |
| 26 | JNCI Cancer Spectrum | 10 | Top 0.7% | 0.5% |
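
The statement that the top 2 journals account for 50% of the predicted probability mass can be checked with a short cumulative sum. The sketch below uses the probabilities listed for the five highest-ranked journals; the function name `journals_covering` is hypothetical and is not part of the site that produced this ranking.

```python
def journals_covering(probabilities, threshold=50.0):
    """Return (count, cumulative_mass): how many of the highest-probability
    entries are needed before the running total reaches `threshold` percent.
    Returns (None, total) if the threshold is never reached."""
    total = 0.0
    for count, p in enumerate(sorted(probabilities, reverse=True), start=1):
        total += p
        if total >= threshold:
            return count, total
    return None, total


# Predicted probabilities (percent) for ranks 1 to 5 in the listing.
top_probs = [33.5, 18.9, 4.4, 4.0, 3.7]
count, mass = journals_covering(top_probs)
print(f"{count} journals cover {mass:.1f}% of the probability mass")
```

With these values the first two entries sum to 33.5 + 18.9 = 52.4, which is the first point at which the running total reaches 50%.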