Closing the Paediatric Gap: Adult-Trained AI Generalises Robustly to Paediatric Coeliac Disease Diagnosis

Jaeckle, F.; Gillett, P. M.; Kirkwood, K. J.; Natu, S.; Chan, J. Y. H.; Bateman, A. C.; Arends, M. J.; Soilleux, E. J.

2026-06-05 pathology
10.64898/2026.06.04.26354889 medRxiv

Background: Coeliac disease (CD) diagnosis on duodenal biopsies is limited by interobserver variability. We have previously demonstrated pathologist-level performance with our artificial intelligence (AI) model for the histopathological diagnosis of adult CD, but not in paediatric practice. As paediatric CD screening programmes expand internationally, accurate and scalable diagnostic tools are needed. We investigated whether an AI model trained exclusively on adult whole-slide images (WSIs) can generalise to paediatric CD diagnosis across independent centres.

Methods: A training and validation dataset of 9,958 WSIs from 8,421 adult patients (961 CD) from five centres was used to develop an ensemble of multiple-instance learning models using features from a foundation model. Testing was performed on 708 consecutive paediatric patients (86 CD) from two centres (Edinburgh and Southampton) not included in training. Model calibration was assessed, and probability outputs were grouped into clinically interpretable categories.

Findings: In adult cross-validation, the AI model achieved an area under the receiver operating characteristic curve (AUC) of 98.7%, sensitivity of 84.9%, specificity of 99.0%, and negative predictive value (NPV) of 98.1%. On testing (paediatric) datasets, performance remained high (AUC 98.8%, sensitivity 80.2%, specificity 98.4%, NPV 97.3%). Restricting analysis to predictions outside the intermediate-probability range (predicted CD probability <10% or ≥65%; 85.3% of cases) improved sensitivity to 100% and specificity to 98.7%. No misclassifications were observed among high-confidence predictions (<2% or ≥85%; 66.0% of cases). The expected calibration error was 0.03. Performance improved significantly when biopsies from both duodenal sites (bulb [D1] and descending [D2/3]) were considered.

Interpretation: Our AI model, trained on adult biopsies, generalises to paediatric CD diagnosis across centres and scanner platforms. Well-calibrated probability outputs provide clinically interpretable measures of diagnostic confidence and could support safe identification of CD-negative biopsies within defined thresholds. These findings demonstrate the feasibility of applying adult-derived AI models in paediatric populations and reinforce the importance of multi-site (D1 & D2) biopsy sampling.
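
The probability bands and the calibration measure mentioned in the abstract can be illustrated with a minimal Python sketch. The thresholds (<2%, <10%, ≥65%, ≥85%) come from the abstract; the band label "confident", the function names, and the equal-width 10-bin form of expected calibration error are assumptions made for illustration, since the abstract does not state how the authors computed it.

```python
from typing import List, Sequence, Tuple


def confidence_band(p: float) -> str:
    """Map a predicted CD probability to a band.

    Thresholds follow the abstract; the label "confident" is a
    hypothetical name for the non-intermediate, non-high-confidence range.
    """
    if p < 0.02 or p >= 0.85:
        return "high-confidence"
    if p < 0.10 or p >= 0.65:
        return "confident"
    return "intermediate"


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Equal-width-bin ECE (an assumed variant, not necessarily the authors').

    For each bin, take |observed positive rate - mean predicted probability|
    and average these gaps weighted by the share of cases in the bin.
    """
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    n = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        pos_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / n) * abs(pos_rate - mean_p)
    return ece
```

Under this sketch, a biopsy scored at 0.05 would fall outside the intermediate range but not in the high-confidence band, while one scored at 0.30 would be flagged as intermediate.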

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Modern Pathology | 21 papers in training set | Top 0.1% | 23.9%
2. BMC Medicine | 163 papers in training set | Top 0.4% | 7.2%
3. PLOS ONE | 4510 papers in training set | Top 23% | 7.2%
4. Scientific Reports | 3102 papers in training set | Top 20% | 5.1%
5. Journal of Clinical Pathology | 12 papers in training set | Top 0.1% | 4.6%
6. The Lancet Digital Health | 25 papers in training set | Top 0.2% | 2.8%

50% of probability mass above

7. The Journal of Pathology | 22 papers in training set | Top 0.1% | 2.2%
8. PLOS Computational Biology | 1633 papers in training set | Top 14% | 2.0%
9. Journal of Clinical Microbiology | 120 papers in training set | Top 0.9% | 1.9%
10. Nature Communications | 4913 papers in training set | Top 50% | 1.8%
11. eBioMedicine | 130 papers in training set | Top 1% | 1.8%
12. Journal of Pathology Informatics | 13 papers in training set | Top 0.2% | 1.8%
13. Diagnostics | 48 papers in training set | Top 1% | 1.4%
14. The Lancet | 16 papers in training set | Top 0.4% | 1.3%
15. The American Journal of Pathology | 31 papers in training set | Top 0.3% | 1.3%
16. British Journal of Cancer | 42 papers in training set | Top 1% | 1.0%
17. Gastroenterology | 40 papers in training set | Top 1% | 1.0%
18. Kidney International | 25 papers in training set | Top 0.3% | 1.0%
19. Kidney International Reports | 14 papers in training set | Top 0.2% | 0.8%
20. Genetics in Medicine | 69 papers in training set | Top 0.9% | 0.8%
21. BMJ Paediatrics Open | 21 papers in training set | Top 0.7% | 0.8%
22. Kidney360 | 22 papers in training set | Top 0.5% | 0.8%
23. Journal of Medical Imaging | 11 papers in training set | Top 0.3% | 0.8%
24. Cell Reports Medicine | 140 papers in training set | Top 8% | 0.8%
25. Med | 38 papers in training set | Top 0.8% | 0.8%
26. Diabetologia | 36 papers in training set | Top 0.9% | 0.8%
27. JCI Insight | 241 papers in training set | Top 7% | 0.8%
28. Computer Methods and Programs in Biomedicine | 27 papers in training set | Top 0.9% | 0.8%
29. Biology Methods and Protocols | 53 papers in training set | Top 3% | 0.7%
30. PLOS Neglected Tropical Diseases | 378 papers in training set | Top 5% | 0.7%
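
The statement that the top 6 journals account for 50% of the predicted probability mass can be checked with a short cumulative sum. The sketch below is illustrative only: the probabilities are copied from the ten highest-ranked entries in the list above, and the function name `journals_to_reach` is hypothetical.

```python
from itertools import accumulate
from typing import Optional, Sequence

# Predicted probabilities (%) for ranks 1-10, copied from the ranking above.
top_probs = [23.9, 7.2, 7.2, 5.1, 4.6, 2.8, 2.2, 2.0, 1.9, 1.8]


def journals_to_reach(probs: Sequence[float], target: float = 50.0) -> Optional[int]:
    """Return the smallest rank at which cumulative probability reaches target."""
    for rank, total in enumerate(accumulate(probs), start=1):
        if total >= target:
            return rank
    return None  # target not reached within the supplied entries


print(journals_to_reach(top_probs))
```

With these values the running total is about 48.0% after five journals and about 50.8% after six, which matches the position of the 50% marker in the list.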