
Foundation model-based tool for automated ulcerative colitis histology scoring demonstrates non-inferiority to pathologists across multiple scoring indices

Tahir, W.; Shamshoian, J.; Tauber, J.; Clinton, L. K.; Griffin, M.; Shah, C.; Singh, G.; Fahy, D.; Sucipto, K.; Brosnan-Cashman, J.; Altepeter, T. A.; Bhattacharya, S.; Crandall, W.; Duan, C.; Gale, J. D.; Gupta, V.; Haarmann, H.; Harpaz, N.; Hooper, A. T.; Horowitz, J.; Hurtado-Lorenzo, A.; Hussaini, B. E.; Jairath, V.; Jones, A.; Kostiuk, B.; Kukreja, A.; Laroux, F. S.; Lissoos, T.; McBride, R. B.; Najdawi, F.; Nayyar, A.; Osterman, M. T.; Panchal, P.; Ruane, D.; Travis, S.; Visvanathan, S.; Wilson, L.; Jayson, C.

2026-06-11 pathology
10.64898/2026.06.09.26355212 medRxiv

In clinical trials for ulcerative colitis (UC), pathologists assess disease severity through standardized histological indices, including the Geboes Score, Robarts Histopathology Index (RHI), and Nancy Histologic Index (NHI). Despite strong associations with clinical outcomes, histologic scoring suffers from inter- and intra-reader variability, and consensus criteria for histologic remission remain uncertain. Through a consortium approach, we developed an artificial intelligence-based measurement (AIM) tool for scoring histology in UC mucosal biopsies (AIM-HI UC). This model, trained on a large dataset of UC biopsies (N=10,230), utilizes additive multiple instance learning models leveraging PLUTO, a pathology foundation model, that predict each of the Geboes subgrades, from which the Geboes grade-level score, RHI, and NHI can be calculated. Evaluation of this model on a standalone verification set including clinical trial specimens established algorithm non-inferiority and/or superiority relative to standard qualified pathologists through comparison of algorithm-consensus and pathologist-consensus agreement metrics (non-inferior if difference >-0.1, superior if difference >0, inclusive of confidence intervals). AIM-HI UC was determined to be non-inferior to pathologists (N=3) for the prediction of all seven Geboes subgrades, grade-level Geboes, RHI, NHI, histologic improvement (GS<3.1), 2A histologic remission (GS<2A.0), and 2B histologic remission (GS<2B.0). AIM-HI UC was superior to pathologists for several Geboes subgrades (GS 0, GS 1, GS 2B, and GS 5), as well as grade-level Geboes, RHI, and positive percent agreement of 2A histologic remission. The model was shown to be greater than 99% repeatable for all histologic scoring metrics examined. 
Model-derived scores were shown to strongly correlate with canonical histologic features of inflammation, including the proportion of total epithelium that is inflamed (Spearman r=0.83; p<0.01), the proportion of neutrophils localized within crypt epithelium (Spearman r=0.83, p<0.01), and the amount of mucosal area classified as erosion or ulceration (Spearman r=0.80, p<0.01). Overall, these results suggest that AIM-HI UC has the potential to improve consistency of UC histology interpretation, providing a path toward standardization of UC histology scoring in clinical trials.
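The non-inferiority and superiority thresholds quoted in the abstract can be illustrated with a minimal sketch. It assumes that "inclusive of confidence intervals" means the lower confidence bound of the difference (algorithm-consensus agreement minus pathologist-consensus agreement) must clear the margin. The abstract does not state the agreement metric or how the intervals were computed, so the class name, function name, and all numeric inputs below are hypothetical.

```python
from dataclasses import dataclass

# Margins taken from the abstract: non-inferior if difference > -0.1,
# superior if difference > 0.
NON_INFERIORITY_MARGIN = -0.1
SUPERIORITY_MARGIN = 0.0


@dataclass
class AgreementDifference:
    """Algorithm-consensus minus pathologist-consensus agreement (hypothetical container)."""
    estimate: float
    ci_lower: float
    ci_upper: float


def classify(diff: AgreementDifference) -> str:
    """Apply the margins to the lower confidence bound (an assumed reading of the abstract)."""
    if diff.ci_lower > SUPERIORITY_MARGIN:
        return "superior"  # superiority also implies non-inferiority
    if diff.ci_lower > NON_INFERIORITY_MARGIN:
        return "non-inferior"
    return "inconclusive"


# Made-up values for illustration only; these are not results from the study.
examples = {
    "metric A": AgreementDifference(0.05, 0.01, 0.09),
    "metric B": AgreementDifference(-0.02, -0.08, 0.04),
    "metric C": AgreementDifference(-0.05, -0.15, 0.05),
}
for name, diff in examples.items():
    print(name, classify(diff))
```

Checking the lower bound rather than the point estimate is the conservative choice: a metric with a favourable estimate but a wide interval is not declared non-inferior.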

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. Modern Pathology: 21 papers in training set; Top 0.1%; 14.8%
2. PLOS ONE: 4510 papers in training set; Top 16%; 10.8%
3. Scientific Reports: 3102 papers in training set; Top 13%; 7.0%
4. JCI Insight: 241 papers in training set; Top 0.7%; 5.0%
5. Computers in Biology and Medicine: 120 papers in training set; Top 0.4%; 5.0%
6. Inflammatory Bowel Diseases: 15 papers in training set; Top 0.1%; 4.1%
7. Gastroenterology: 40 papers in training set; Top 0.5%; 4.1%

50% of probability mass above

8. npj Digital Medicine: 97 papers in training set; Top 1%; 3.7%
9. BMC Medicine: 163 papers in training set; Top 2%; 2.7%
10. American Journal of Physiology-Gastrointestinal and Liver Physiology: 11 papers in training set; Top 0.1%; 2.1%
11. Cellular and Molecular Gastroenterology and Hepatology: 41 papers in training set; Top 0.3%; 1.9%
12. The American Journal of Pathology: 31 papers in training set; Top 0.1%; 1.9%
13. Journal of Clinical Pathology: 12 papers in training set; Top 0.1%; 1.8%
14. PLOS Computational Biology: 1633 papers in training set; Top 16%; 1.7%
15. Cell Reports Medicine: 140 papers in training set; Top 4%; 1.5%
16. Science Translational Medicine: 111 papers in training set; Top 3%; 1.5%
17. Frontiers in Pharmacology: 100 papers in training set; Top 3%; 1.3%
18. Journal of Medical Internet Research: 85 papers in training set; Top 3%; 1.3%
19. Journal of Pathology Informatics: 13 papers in training set; Top 0.3%; 1.0%
20. Nature Communications: 4913 papers in training set; Top 59%; 0.9%
21. eLife: 5422 papers in training set; Top 53%; 0.9%
22. eBioMedicine: 130 papers in training set; Top 3%; 0.8%
23. Computational and Structural Biotechnology Journal: 216 papers in training set; Top 8%; 0.8%
24. Journal of Allergy and Clinical Immunology: 25 papers in training set; Top 0.7%; 0.8%
25. Gut: 36 papers in training set; Top 0.8%; 0.8%
26. Clinical Cancer Research: 58 papers in training set; Top 2%; 0.7%
27. Journal of Clinical Investigation: 164 papers in training set; Top 7%; 0.7%
28. Clinical Pharmacology & Therapeutics: 25 papers in training set; Top 0.8%; 0.7%
29. Computer Methods and Programs in Biomedicine: 27 papers in training set; Top 1%; 0.7%
30. International Journal of Molecular Sciences: 453 papers in training set; Top 17%; 0.7%
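
The 50% cutoff marked in the list above can be reproduced from the listed probabilities with a running sum. The sketch below copies the values for the ten highest-ranked journals from the list; the function name `journals_to_reach` is hypothetical, and the page does not describe how the probabilities themselves were predicted.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-10, copied from the list.
probabilities = [14.8, 10.8, 7.0, 5.0, 5.0, 4.1, 4.1, 3.7, 2.7, 2.1]


def journals_to_reach(probs, threshold):
    """Return the smallest rank whose cumulative probability reaches the threshold."""
    for rank, cumulative in enumerate(accumulate(probs), start=1):
        if cumulative >= threshold:
            return rank
    return None  # threshold not reached by the supplied entries


print(journals_to_reach(probabilities, 50.0))  # prints 7
```

The first six journals sum to 46.7% and the seventh brings the total to 50.8%, which matches the statement that the top 7 journals account for 50% of the predicted probability mass.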