Interpretable machine learning applied to high-dimensional salivary proteomics accurately classifies pediatric inflammatory bowel diseases

Rupp, B. T.; Reyna, J.; Giunta, A.; Weaver, T.; Chason, K.; Liu, J.; Gulati, A. S.; Byrd, K. M.

2025-10-17 gastroenterology
10.1101/2025.10.14.25337919 medRxiv

Background and aims: Inflammatory bowel diseases (IBD), including Crohn's disease (CD), ulcerative colitis (UC), and IBD-unclassified (IBD-U), are chronic inflammatory disorders of the gastrointestinal tract. Current methods for classification and longitudinal monitoring are invasive, expensive, and often delayed, limiting timely diagnosis and management. This study reports the first application of high-dimensional salivary proteomics integrated with interpretable artificial intelligence/machine learning (AI/ML) to define a minimal protein signature for pediatric IBD classification, with the goal of informing therapeutic decision-making.

Methods: Unstimulated saliva from pediatric CD, UC, and IBD-U patients was analyzed using the Alamar Biosciences NULISAseq Inflammation Panel 250 (250 proteins). Logistic regression with recursive feature elimination identified a minimal discriminative signature. Performance was tested in independent follow-up samples. SHapley Additive exPlanations (SHAP) quantified patient-specific protein contributions and assessed the biological similarity of IBD-U to CD and UC.

Results: Differential abundance analysis between UC and CD revealed 53 significantly different proteins. ML identified a 14-protein signature comprising chemokines/cytokines (CCL1, IFNA1;IFNA13, IL12p70, IL34, TNFSF11/RANKL), receptors/ligands (CD40LG, ICOSLG, IL1R2, IL17RA), structural/tissue-remodeling proteins (CD93, GFAP, SPP1), and growth factors/immune modulators (GDF2, GZMA). The model achieved 96.2% overall accuracy in first-visit samples and 86.4% overall accuracy in follow-up testing. SHAP revealed patient-specific drivers and suggested biological alignment of IBD-U cases toward CD-like or UC-like profiles.

Conclusions: This first-in-field integration of salivary proteomics with interpretable AI/ML demonstrates that accurate, noninvasive classification of pediatric IBD is possible using minimal biomarker sets. This approach establishes a scalable framework for future longitudinal monitoring and supports earlier and more precise therapeutic interventions.
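
For readers unfamiliar with the feature-selection step named in the abstract, the following is a minimal sketch of logistic regression with recursive feature elimination using scikit-learn. It is not the authors' pipeline: the data are synthetic (generated with `make_classification`), the sample count, regularization settings, and elimination step size are assumptions, and only the panel width (250) and the target signature size (14) are taken from the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a 250-feature panel; NOT the study's data.
# The sample count (120) is an arbitrary assumption.
X, y = make_classification(
    n_samples=120,
    n_features=250,
    n_informative=14,
    n_redundant=0,
    random_state=0,
)

# Put features on a common scale so coefficient magnitudes are comparable.
X_scaled = StandardScaler().fit_transform(X)

# RFE repeatedly fits the model and drops the feature with the
# smallest absolute coefficient until 14 features remain.
selector = RFE(
    estimator=LogisticRegression(max_iter=1000),
    n_features_to_select=14,
    step=1,
)
selector.fit(X_scaled, y)

# Column indices of the retained features.
selected = np.flatnonzero(selector.support_)
print("Selected feature indices:", selected.tolist())
```

In a real analysis the scaling and selection would normally sit inside a cross-validation loop, so that held-out samples do not influence which features are kept.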

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Journal of Proteomics: 27 papers in training set, Top 0.1%, 23.1%
2. mSystems: 361 papers in training set, Top 2%, 5.0%
3. Journal of Proteome Research: 215 papers in training set, Top 0.6%, 5.0%
4. Scientific Reports: 3102 papers in training set, Top 34%, 3.7%
5. Biomedicines: 66 papers in training set, Top 0.2%, 3.7%
6. BMC Medicine: 163 papers in training set, Top 1%, 3.7%
7. eBioMedicine: 130 papers in training set, Top 0.3%, 3.7%
8. Analytical Chemistry: 205 papers in training set, Top 1%, 2.7%

50% of probability mass above

9. Frontiers in Cell and Developmental Biology: 218 papers in training set, Top 3%, 2.5%
10. Immunology & Cell Biology: 11 papers in training set, Top 0.1%, 2.1%
11. Metabolites: 50 papers in training set, Top 0.3%, 2.1%
12. Molecular & Cellular Proteomics: 158 papers in training set, Top 0.9%, 2.1%
13. Bioengineering & Translational Medicine: 21 papers in training set, Top 0.3%, 2.1%
14. Nature Communications: 4913 papers in training set, Top 48%, 1.9%
15. PROTEOMICS: 35 papers in training set, Top 0.4%, 1.7%
16. Frontiers in Physiology: 93 papers in training set, Top 3%, 1.7%
17. The Journal of Clinical Endocrinology & Metabolism: 35 papers in training set, Top 0.8%, 1.5%
18. Cell Reports Medicine: 140 papers in training set, Top 4%, 1.5%
19. PLOS ONE: 4510 papers in training set, Top 58%, 1.4%
20. Frontiers in Medicine: 113 papers in training set, Top 4%, 1.4%
21. Journal of Clinical Medicine: 91 papers in training set, Top 4%, 1.3%
22. Frontiers in Cellular and Infection Microbiology: 98 papers in training set, Top 4%, 1.3%
23. Clinical Proteomics: 10 papers in training set, Top 0.1%, 0.9%
24. EMBO Molecular Medicine: 85 papers in training set, Top 5%, 0.7%
25. Science Translational Medicine: 111 papers in training set, Top 6%, 0.7%
26. Frontiers in Pharmacology: 100 papers in training set, Top 5%, 0.7%
27. Metabolomics: 11 papers in training set, Top 0.5%, 0.7%
28. Microbiology Spectrum: 435 papers in training set, Top 6%, 0.7%
29. eLife: 5422 papers in training set, Top 58%, 0.7%
30. Endocrinology: 38 papers in training set, Top 0.7%, 0.7%
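
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. The sketch below does this for the first ten entries. The function name `journals_to_reach` is hypothetical, and because the listed percentages are rounded, the running total is only approximate.

```python
# (journal, predicted probability in percent), in the ranked order listed above.
ranked = [
    ("Journal of Proteomics", 23.1),
    ("mSystems", 5.0),
    ("Journal of Proteome Research", 5.0),
    ("Scientific Reports", 3.7),
    ("Biomedicines", 3.7),
    ("BMC Medicine", 3.7),
    ("eBioMedicine", 3.7),
    ("Analytical Chemistry", 2.7),
    ("Frontiers in Cell and Developmental Biology", 2.5),
    ("Immunology & Cell Biology", 2.1),
]


def journals_to_reach(entries, threshold):
    """Return (count, cumulative percent) at the first rank where the
    running total meets or exceeds the threshold, or None if it never does."""
    total = 0.0
    for count, (_, pct) in enumerate(entries, start=1):
        total += pct
        if total >= threshold:
            return count, total
    return None


result = journals_to_reach(ranked, 50.0)
print(result)
```

With the rounded values shown, the running total is about 47.9% after seven journals and about 50.6% after eight, which agrees with the stated cutoff.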