A Supervised Learning Framework for Stroke Hospitalization Factors Selection Using the Lasso-MIDAS Model

Li, Q.; Wang, L.

2026-05-20 cardiovascular medicine
10.64898/2026.05.15.26353365 medRxiv
Stroke, as an acute cerebrovascular disease with significant public health implications, is influenced by a complex interplay of meteorological conditions, air quality, and socioeconomic factors. However, the inherent challenges of mixed-frequency data from diverse sources and high-dimensional variable spaces limit the effectiveness of traditional regression models. This study develops a Lasso-MIDAS model framework to identify the key multidimensional drivers of stroke admissions. Using this approach, 21 candidate variables encompassing meteorological, environmental, and economic indicators were screened. The empirical results identified 11 core influencing factors. In the meteorological and environmental dimensions, Wind Speed, Carbon Monoxide (CO), and Sulfur Dioxide (SO2) were identified as significant positive drivers, with Temperature Difference also positively correlating with admission risks. Conversely, Nitrogen Dioxide (NO2) exhibited a negative correlation, potentially reflecting behavioral adaptation and exposure reduction during peak pollution periods. In the socioeconomic dimension, the Consumer Price Index (CPI) for Food, Tobacco, and Alcohol emerged as a major risk factor, highlighting the impact of living cost pressures on public health. The findings demonstrate the superiority of the Lasso-MIDAS model in handling large-scale healthcare data. It effectively addresses the frequency mismatch problem while enhancing the robustness of causal identification through variable shrinkage. These conclusions provide a scientific basis for health authorities to establish early warning systems and optimize public health policy interventions.
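
The abstract does not give the model specification, so the following is only a minimal sketch of one common way to combine MIDAS aggregation with Lasso selection, not the authors' implementation. It assumes exponential Almon lag weights with fixed, hypothetical parameters (`theta1`, `theta2`), 30 high-frequency observations per low-frequency period, synthetic data, and a plain coordinate-descent Lasso without an intercept. All function names are illustrative.

```python
import math
import random

def almon_weights(n_lags, theta1=0.0, theta2=-0.05):
    """Exponential Almon lag weights, normalized to sum to 1 (lag 0 = most recent)."""
    raw = [math.exp(theta1 * k + theta2 * k * k) for k in range(n_lags)]
    total = sum(raw)
    return [r / total for r in raw]

def midas_aggregate(high_freq, m, n_lags, weights):
    """Collapse a high-frequency series to one weighted value per low-frequency period.

    m is the number of high-frequency observations per low-frequency period.
    Periods that do not yet have n_lags observations of history are skipped.
    """
    out = []
    for t in range(len(high_freq) // m):
        end = (t + 1) * m  # index just past the last observation of period t
        if end < n_lags:
            continue
        window = high_freq[end - n_lags:end][::-1]  # most recent observation first
        out.append(sum(w * x for w, x in zip(weights, window)))
    return out

def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within gamma of zero become zero."""
    return math.copysign(max(abs(z) - gamma, 0.0), z)

def lasso_cd(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + alpha * ||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb, kept up to date after every coefficient change
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the residual after adding back j's own fit
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_b = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_b - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_b
    return beta

# Synthetic illustration (assumed data): three "daily" series, "monthly" outcome
# that depends only on the first series.
random.seed(0)
m, n_lags, n_months, p = 30, 30, 48, 3
w = almon_weights(n_lags)
series = [[random.gauss(0, 1) for _ in range(m * n_months)] for _ in range(p)]
cols = [midas_aggregate(s, m, n_lags, w) for s in series]
X = [[cols[j][t] for j in range(p)] for t in range(len(cols[0]))]
y = [2.0 * row[0] + random.gauss(0, 0.05) for row in X]

beta = lasso_cd(X, y, alpha=0.05)
print(beta)  # nonzero entries mark the selected regressors
```

In this toy setup the L1 penalty shrinks the coefficient on the relevant series and sets the irrelevant ones to zero, which is the selection behavior the abstract relies on. In practice the lag weights would be estimated or replaced by an unrestricted or polynomial lag basis, and the penalty strength chosen by cross-validation.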

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. PLOS ONE | 4510 papers in training set | Top 11% | 15.4%
2. Scientific Reports | 3102 papers in training set | Top 5% | 10.5%
3. International Journal of Environmental Research and Public Health | 124 papers in training set | Top 0.6% | 7.1%
4. Computers in Biology and Medicine | 120 papers in training set | Top 0.4% | 5.1%
5. BMC Medical Informatics and Decision Making | 39 papers in training set | Top 0.6% | 4.1%
6. PeerJ | 261 papers in training set | Top 2% | 3.8%
7. Sensors | 39 papers in training set | Top 0.5% | 3.2%
8. Heliyon | 146 papers in training set | Top 0.6% | 2.9%
50% of probability mass above
9. PLOS Global Public Health | 293 papers in training set | Top 3% | 2.6%
10. Science of The Total Environment | 179 papers in training set | Top 2% | 2.5%
11. Cureus | 67 papers in training set | Top 2% | 2.2%
12. BMC Medicine | 163 papers in training set | Top 3% | 2.0%
13. JMIR Medical Informatics | 17 papers in training set | Top 0.6% | 1.9%
14. Annals of Biomedical Engineering | 34 papers in training set | Top 0.6% | 1.8%
15. npj Digital Medicine | 97 papers in training set | Top 2% | 1.7%
16. Environment International | 42 papers in training set | Top 0.8% | 1.4%
17. IEEE Access | 31 papers in training set | Top 0.5% | 1.4%
18. Chaos, Solitons & Fractals | 32 papers in training set | Top 1% | 1.3%
19. Spatial and Spatio-temporal Epidemiology | 10 papers in training set | Top 0.1% | 1.2%
20. JMIR Public Health and Surveillance | 45 papers in training set | Top 3% | 1.2%
21. Epidemics | 104 papers in training set | Top 1% | 1.0%
22. Medical Image Analysis | 33 papers in training set | Top 0.9% | 0.9%
23. Frontiers in Applied Mathematics and Statistics | 10 papers in training set | Top 0.3% | 0.9%
24. Disaster Medicine and Public Health Preparedness | 16 papers in training set | Top 1% | 0.9%
25. MethodsX | 14 papers in training set | Top 0.2% | 0.9%
26. Journal of Public Health | 23 papers in training set | Top 0.9% | 0.8%
27. Acta Tropica | 13 papers in training set | Top 0.7% | 0.8%
28. Journal of Clinical Medicine | 91 papers in training set | Top 6% | 0.8%
29. Nature Communications | 4913 papers in training set | Top 62% | 0.8%
30. Frontiers in Physiology | 93 papers in training set | Top 6% | 0.8%
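
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked with a running total over the listed percentages. The snippet below is a small sketch that uses only the first ten values from the list; the helper name `journals_to_reach` is illustrative.

```python
# Predicted probabilities (percent) for the ten top-ranked journals, copied from the list.
probs = [15.4, 10.5, 7.1, 5.1, 4.1, 3.8, 3.2, 2.9, 2.6, 2.5]

def journals_to_reach(probs, threshold):
    """Return (count, running total) once the running total first reaches threshold."""
    total = 0.0
    for count, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return count, total
    return None, total  # threshold not reached with the values supplied

count, total = journals_to_reach(probs, 50.0)
print(count, round(total, 1))  # prints: 8 52.1
```

The first seven entries sum to 49.2%, just under the threshold, so the cutoff falls after the eighth journal at 52.1%.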