
Accuracy of Foundation AI Models for Hepatic Macrovesicular Steatosis Quantification in Frozen Sections

Koga, S.; Guda, A.; Wang, Y.; Sahni, A.; Wu, J.; Rosen, A.; Nield, J.; Nandish, N.; Patel, K.; Goldman, H.; Rajapakse, C.; Walle, S.; Kristen, S.; Tondon, R.; Alipour, Z.

Posted 2025-09-17 on medRxiv (pathology). DOI: 10.1101/2025.09.16.25335833

Introduction: Accurate intraoperative assessment of macrovesicular steatosis in donor liver biopsies is critical for transplantation decisions but is often limited by inter-observer variability and freezing artifacts that can obscure histological details. Artificial intelligence (AI) offers a potential solution for standardized, reproducible evaluation. We aimed to evaluate the diagnostic performance of two self-supervised learning (SSL)-based foundation models, Prov-GigaPath and UNI, for classifying macrovesicular steatosis in frozen liver biopsy sections, compared with assessments by surgical pathologists.

Methods: We retrospectively analyzed 131 frozen liver biopsy specimens from 68 donors collected between November 2022 and September 2024. Slides were digitized into whole-slide images, tiled into patches, and used to extract embeddings with Prov-GigaPath and UNI; slide-level classifiers were then trained and tested. Intraoperative diagnoses by on-call surgical pathologists were compared with ground truth determined from independent reviews of permanent sections by two liver pathologists. Accuracy was evaluated for both five-category classification and a clinically significant binary threshold (<30% vs. ≥30%).

Results: For binary classification, Prov-GigaPath achieved 96.4% accuracy, UNI 85.7%, and surgical pathologists 84.0% (P = .22). In five-category classification, accuracies were lower: Prov-GigaPath 57.1%, UNI 50.0%, and pathologists 58.7% (P = .70). Misclassification occurred primarily in the intermediate categories (5% to <30% steatosis).

Conclusions: SSL-based foundation models performed comparably to surgical pathologists in classifying macrovesicular steatosis at the clinically relevant <30% vs. ≥30% threshold. These findings support a potential role for AI in standardizing intraoperative evaluation of donor liver biopsies; however, the small sample size limits generalizability, and validation in larger, balanced cohorts is required.
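The pipeline described in Methods (tile the whole-slide image into patches, extract an embedding per patch with a foundation model, then train a slide-level classifier on the binary <30% vs. ≥30% threshold) can be sketched as below. This is a hypothetical illustration, not the authors' code: the embedding dimension, mean-pooling aggregation, logistic-regression head, and all synthetic data are assumptions standing in for the real Prov-GigaPath/UNI embeddings.

```python
# Hypothetical sketch of slide-level binary classification (<30% vs >=30%
# macrovesicular steatosis) from patch embeddings. Assumes embeddings were
# already extracted by a foundation model (e.g., Prov-GigaPath or UNI);
# here they are replaced by synthetic vectors for illustration.
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM = 64  # real foundation-model embeddings are much larger (1024+)

def pool_slide(patch_embeddings: np.ndarray) -> np.ndarray:
    """Mean-pool patch embeddings into one slide-level feature vector."""
    return patch_embeddings.mean(axis=0)

# Synthetic stand-in data: 40 slides with variable patch counts. Slides
# labeled >=30% steatosis get embeddings shifted along a fixed direction.
direction = rng.normal(size=EMB_DIM)
X, y = [], []
for i in range(40):
    label = i % 2                      # 1 means >= 30% steatosis
    n_patches = rng.integers(50, 200)
    patches = rng.normal(size=(n_patches, EMB_DIM)) + 2.0 * label * direction
    X.append(pool_slide(patches))
    y.append(label)
X = np.stack(X)
y = np.array(y, dtype=float)

# Simple logistic-regression head trained by plain gradient descent.
w = np.zeros(EMB_DIM)
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * (p - y).mean()

pred = (1.0 / (1.0 + np.exp(-(X @ w + b))) >= 0.5).astype(float)
accuracy = (pred == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

In practice the aggregation step is often more sophisticated (e.g., attention-based pooling over patches), but mean pooling plus a linear head is a common baseline for slide-level labels.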

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

Each entry lists: rank, journal, predicted probability (papers in training set; percentile rank).

1. Modern Pathology: 22.7% (21 papers in training set; Top 0.1%)
2. The American Journal of Pathology: 18.8% (31 papers in training set; Top 0.1%)
3. Biology Methods and Protocols: 8.5% (53 papers in training set; Top 0.1%)
4. PLOS ONE: 4.9% (4510 papers in training set; Top 31%)
   -- 50% of probability mass above --
5. Journal of Clinical Pathology: 2.4% (12 papers in training set; Top 0.1%)
6. Scientific Reports: 1.9% (3102 papers in training set; Top 53%)
7. American Journal of Transplantation: 1.9% (15 papers in training set; Top 0.1%)
8. BMC Medicine: 1.9% (163 papers in training set; Top 3%)
9. Journal of Pathology Informatics: 1.9% (13 papers in training set; Top 0.2%)
10. Hepatology Communications: 1.9% (21 papers in training set; Top 0.2%)
11. Computers in Biology and Medicine: 1.8% (120 papers in training set; Top 2%)
12. The Journal of Pathology: 1.7% (22 papers in training set; Top 0.1%)
13. Journal of Medical Imaging: 1.5% (11 papers in training set; Top 0.2%)
14. Diagnostics: 1.2% (48 papers in training set; Top 1%)
15. Frontiers in Pharmacology: 1.2% (100 papers in training set; Top 3%)
16. Cancers: 1.2% (200 papers in training set; Top 4%)
17. Clinical and Translational Science: 1.1% (21 papers in training set; Top 0.7%)
18. Journal of Clinical Medicine: 1.0% (91 papers in training set; Top 5%)
19. eLife: 1.0% (5422 papers in training set; Top 51%)
20. NMR in Biomedicine: 0.9% (24 papers in training set; Top 0.3%)
21. Annals of Internal Medicine: 0.9% (27 papers in training set; Top 0.7%)
22. American Journal of Physiology-Gastrointestinal and Liver Physiology: 0.8% (11 papers in training set; Top 0.2%)
23. Journal of Biophotonics: 0.8% (16 papers in training set; Top 0.6%)
24. Frontiers in Medicine: 0.8% (113 papers in training set; Top 6%)
25. eBioMedicine: 0.8% (130 papers in training set; Top 4%)
26. JAMA Network Open: 0.8% (127 papers in training set; Top 4%)
27. The American Journal of Tropical Medicine and Hygiene: 0.7% (60 papers in training set; Top 4%)
28. JCI Insight: 0.6% (241 papers in training set; Top 8%)
29. Hepatology: 0.6% (18 papers in training set; Top 0.4%)
30. The Lancet: 0.6% (16 papers in training set; Top 0.9%)
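The "top 4 journals account for 50% of the predicted probability mass" claim can be checked by summing the listed percentages. Note that with the rounded display values the cumulative mass already reaches 50.0% at the third entry, so the "top 4" statement presumably reflects the unrounded underlying probabilities.

```python
# Cumulative probability mass over the top predicted journals, using the
# rounded percentages shown in the list above.
top_probs = [22.7, 18.8, 8.5, 4.9]   # Modern Pathology ... PLOS ONE
cumulative = []
total = 0.0
for p in top_probs:
    total += p
    cumulative.append(round(total, 1))
print(cumulative)   # cumulative mass after each of the top 4 entries
```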