Explainable machine learning for revisiting reported Irritable Bowel Syndrome correlates in a student cohort

Ramirez-Lopez, L.; Kang, P.

2026-04-15 gastroenterology
10.64898/2026.04.13.26350820 medRxiv
Irritable Bowel Syndrome (IBS) affects a substantial proportion of university students, yet its correlates remain incompletely characterised in South Asian populations. We reanalysed a publicly available dataset of 550 Bangladeshi students from Hasan et al. (2025), conducting a data audit that identified implausible records, including males reporting menstrual symptoms, and reduced the analytic sample to 506 observations. Using Explainable Boosting Machines (EBMs), which capture non-linear effects and pairwise interactions without sacrificing interpretability, we found that psychological distress, elevated BMI and academic dissatisfaction were the strongest predictors of IBS (mean AUC = 0.852 across 100 stratified train-test splits). Critically, several findings diverged from the original logistic regression analysis. Physical activity showed a non-linear risk pattern only at high intensity; the association with gender was substantially weaker once metabolic and psychological factors were also accounted for; and malnourishment did not have as strong an impact as in the original study. These divergences likely arise because the machine-learning model captures non-linear effects and interactions that were not represented in the original regression specification. Our findings underscore the value of reanalysing existing datasets with methods suited to capturing complexity and highlight data quality verification as a necessary step in secondary analysis.
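
The two procedural steps named in the abstract, a consistency audit and evaluation by mean AUC over repeated stratified train-test splits, can be illustrated with a minimal standard-library sketch. This is not the authors' code. The field names `sex` and `menstrual_symptoms`, the 20% test fraction, and the `fit` callback are all assumptions made for illustration; in the study the fitted model would be an EBM rather than the placeholder used here.

```python
import random
from statistics import mean

def audit(records):
    """Drop internally inconsistent records.

    Hypothetical rule and field names: a respondent recorded as male
    who also reports menstrual symptoms is treated as implausible.
    """
    return [r for r in records
            if not (r["sex"] == "male" and r["menstrual_symptoms"])]

def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_split(labels, test_frac, rng):
    """Split indices so that each class keeps roughly the same share in both parts."""
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = max(1, round(len(idx) * test_frac))
        test += idx[:cut]
        train += idx[cut:]
    return train, test

def mean_auc(labels, features, fit, n_splits=100, test_frac=0.2, seed=0):
    """Average test-set AUC over repeated stratified splits.

    `fit(X_train, y_train)` is a placeholder that must return a scoring
    function mapping one feature row to a risk score.
    """
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_splits):
        train, test = stratified_split(labels, test_frac, rng)
        model = fit([features[i] for i in train], [labels[i] for i in train])
        scores = [model(features[i]) for i in test]
        aucs.append(auc([labels[i] for i in test], scores))
    return mean(aucs)
```

Averaging over many splits, rather than reporting a single split, reduces the dependence of the reported AUC on one particular partition of a small sample.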

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. PLOS Digital Health: 91 papers in training set; Top 0.1%; 23.4%
2. Scientific Reports: 3102 papers in training set; Top 0.5%; 20.2%
3. eLife: 5422 papers in training set; Top 8%; 8.8%
(50% of probability mass above)
4. BMC Medicine: 163 papers in training set; Top 0.2%; 8.8%
5. PeerJ: 261 papers in training set; Top 3%; 3.0%
6. Frontiers in Psychology: 49 papers in training set; Top 0.3%; 2.2%
7. Nature Communications: 4913 papers in training set; Top 53%; 1.5%
8. The Journal of Clinical Endocrinology & Metabolism: 35 papers in training set; Top 0.8%; 1.4%
9. Nature Human Behaviour: 85 papers in training set; Top 3%; 1.3%
10. Frontiers in Cellular and Infection Microbiology: 98 papers in training set; Top 4%; 1.3%
11. Journal of Psychosomatic Research: 11 papers in training set; Top 0.2%; 1.2%
12. PLOS ONE: 4510 papers in training set; Top 61%; 1.2%
13. Journal of Affective Disorders: 81 papers in training set; Top 1%; 1.2%
14. American Journal of Gastroenterology: 15 papers in training set; Top 0.3%; 0.8%
15. PLOS Computational Biology: 1633 papers in training set; Top 23%; 0.8%
16. Bioinformatics Advances: 184 papers in training set; Top 4%; 0.8%
17. Frontiers in Physiology: 93 papers in training set; Top 5%; 0.8%
18. PNAS Nexus: 147 papers in training set; Top 2%; 0.7%
19. Heliyon: 146 papers in training set; Top 7%; 0.7%
20. Biostatistics: 21 papers in training set; Top 0.1%; 0.7%
21. eBioMedicine: 130 papers in training set; Top 4%; 0.7%
22. Frontiers in Psychiatry: 83 papers in training set; Top 3%; 0.7%
23. European Journal of Neuroscience: 168 papers in training set; Top 2%; 0.7%
24. BMC Genomics: 328 papers in training set; Top 7%; 0.7%
25. The American Journal of Human Genetics: 206 papers in training set; Top 4%; 0.7%
26. Disease Models & Mechanisms: 119 papers in training set; Top 3%; 0.7%
27. EMBO Molecular Medicine: 85 papers in training set; Top 5%; 0.7%
28. Cancers: 200 papers in training set; Top 5%; 0.5%
29. Frontiers in Genetics: 197 papers in training set; Top 12%; 0.5%
30. Frontiers in Cell and Developmental Biology: 218 papers in training set; Top 11%; 0.5%
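
The statement that the top 3 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order until the threshold is crossed. The short sketch below does this for the first five listed values; the function name `journals_to_reach` is hypothetical and chosen only for illustration.

```python
def journals_to_reach(probabilities, threshold=50.0):
    """Return (k, cumulative %) for the smallest k whose top-k sum meets the threshold."""
    total = 0.0
    for k, p in enumerate(probabilities, start=1):
        total += p
        if total >= threshold:
            return k, round(total, 1)
    return None  # threshold never reached with the values supplied

# Probabilities (%) of the five highest-ranked journals, as listed above.
top_five = [23.4, 20.2, 8.8, 8.8, 3.0]
print(journals_to_reach(top_five))  # (3, 52.4)
```

The first two journals sum to 43.6%, so the third is needed to pass 50%.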