The Crucial Role of Predictive Models in Childhood Asthma Care: Improving Outcomes Through Data-Driven Insights

Chakraborty, A.; Bashar, A. R.

2025-07-23 health informatics
10.1101/2025.07.23.25332082 medRxiv
Background: Asthma is one of the most prominent chronic diseases in children and one of the most challenging conditions to diagnose in infants and preschoolers in the United States. Predictive models offer a data-driven approach to improve early diagnosis, personalize treatment strategies, and predict disease progression. Using national survey data, this study builds and compares high-performing predictive models based on 28 associated risk factors and identifies the factors that contribute most to childhood asthma.

Method: Data came from the 2011-2020 BRFSS Asthma Call Back Survey (ACBS). The cross-sectional study included 9813 participants with a response rate of 65% (current asthma status positive). Respondents were randomly divided into training and testing samples. Grid search was used to find the optimal hyperparameter values for the eXtreme Gradient Boosting (XGBoost) model. The fitted XGBoost model was compared with four competing machine learning (ML) models: support vector machine (SVM), random forest, LASSO regression, and gradient boosting machine (GBM). Model performance was compared using accuracy, AUC, precision, and recall. A variable importance plot (VIP) was used to measure each predictor's percentage contribution to the response, and a SHapley Additive exPlanations (SHAP) plot was used to understand how the predictors relate to the outcome. The chi-square test was used to measure the association between the predictors and the outcome.

Results: Asthma diagnosis varied by age group, with the highest prevalence in kindergarten-age children (31.44%). Of the five predictive models, XGBoost performed best (AUC: 0.95), followed by random forest (AUC: 0.9345), GBM (AUC: 0.9341), SVM (AUC: 0.9304), and LASSO (AUC: 0.88); however, the random forest model had the highest sensitivity (0.9786) and is hence preferred for initial screening of asthma. According to the VIP, the top two contributing predictors were overnight hospitalization visits and time since the last asthma medication, accounting for 24.62% and 20.92% of the contribution to asthma status, respectively.

Conclusion: The model development methodology proved instrumental in discovering behavioral health-risk knowledge and in demonstrating the value of predictive modeling on a multidimensional behavioral health survey. These insights may help in predicting different types of chronic lung disease affecting people of all ages and can help clinicians diagnose asthma at an early stage, allowing for early intervention and proactive management.
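The abstract compares models on accuracy, AUC, precision, and recall (recall is the same quantity as the sensitivity reported for the random forest). As a rough illustration of how these four metrics are defined, here is a minimal pure-Python sketch. The labels, scores, and 0.5 threshold below are hypothetical toy values, not data from the study, and the function names are illustrative rather than the authors' code.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    # Share of all predictions that are correct.
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    # Share of predicted positives that are true positives.
    return tp / (tp + fp)


def recall(tp, fn):
    # Share of actual positives that are detected (sensitivity).
    return tp / (tp + fn)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = current asthma, 0 = no current asthma.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(tp, fp, tn, fn))
print("precision:", precision(tp, fp))
print("recall:", recall(tp, fn))
print("AUC:", roc_auc(y_true, scores))
```

Accuracy, precision, and recall depend on the chosen threshold, while AUC is computed from the raw scores. This is one reason a model can rank first on AUC while another has the highest sensitivity, as reported in the Results.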

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. International Journal of Medical Informatics: 25 papers in training set, Top 0.1%, 18.7%
2. PLOS ONE: 4510 papers in training set, Top 19%, 10.1%
3. Journal of Medical Internet Research: 85 papers in training set, Top 0.7%, 6.4%
4. Scientific Reports: 3102 papers in training set, Top 27%, 4.3%
5. Frontiers in Public Health: 140 papers in training set, Top 2%, 4.0%
6. JMIR Medical Informatics: 17 papers in training set, Top 0.3%, 3.7%
7. PLOS Digital Health: 91 papers in training set, Top 0.7%, 3.6%
50% of probability mass above
8. European Respiratory Journal: 54 papers in training set, Top 0.5%, 3.6%
9. Computers in Biology and Medicine: 120 papers in training set, Top 0.9%, 3.6%
10. BMC Medical Informatics and Decision Making: 39 papers in training set, Top 0.9%, 3.1%
11. Frontiers in Digital Health: 20 papers in training set, Top 0.3%, 2.9%
12. ERJ Open Research: 44 papers in training set, Top 0.4%, 2.1%
13. eClinicalMedicine: 55 papers in training set, Top 0.3%, 2.1%
14. International Journal of Environmental Research and Public Health: 124 papers in training set, Top 3%, 1.9%
15. BMJ Paediatrics Open: 21 papers in training set, Top 0.5%, 1.3%
16. JMIR Public Health and Surveillance: 45 papers in training set, Top 2%, 1.3%
17. BMC Medical Research Methodology: 43 papers in training set, Top 0.9%, 1.2%
18. BMC Infectious Diseases: 118 papers in training set, Top 4%, 1.2%
19. BMJ Open Respiratory Research: 32 papers in training set, Top 0.5%, 1.2%
20. Journal of the American Medical Informatics Association: 61 papers in training set, Top 2%, 1.2%
21. Frontiers in Pharmacology: 100 papers in training set, Top 3%, 1.1%
22. Journal of Allergy and Clinical Immunology: 25 papers in training set, Top 0.6%, 1.0%
23. JAMIA Open: 37 papers in training set, Top 1%, 1.0%
24. BJGP Open: 12 papers in training set, Top 0.6%, 0.9%
25. BMC Public Health: 147 papers in training set, Top 6%, 0.6%
26. JMIR Formative Research: 32 papers in training set, Top 2%, 0.6%
27. Epidemiology and Infection: 84 papers in training set, Top 4%, 0.5%
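
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. The short sketch below does this for the first ten entries of the list above; the helper name ranks_to_reach is illustrative, and the final percentage on each row is assumed to be that journal's predicted probability.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-10, copied from the list above.
probabilities = [18.7, 10.1, 6.4, 4.3, 4.0, 3.7, 3.6, 3.6, 3.6, 3.1]


def ranks_to_reach(probs, target):
    """Return the smallest rank whose cumulative probability reaches
    the target, or None if the target is never reached."""
    for rank, total in enumerate(accumulate(probs), start=1):
        if total >= target:
            return rank
    return None


print(ranks_to_reach(probabilities, 50.0))
```

With these values the running total stays below 50% through rank 6 and first passes it at rank 7, which matches where the page places its "50% of probability mass above" marker.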