Climate-Informed Deep Learning for Spatio-Temporal Forecasting of Climate-Sensitive Diseases

Tegenaw, G. S.; Degu, M. Z.; Gebeyehu, W. B.; Senay, A. B.; Krishnamoorthy, J.; Ward, T.; Simegn, G. L.

2026-03-24 public and global health
10.64898/2026.03.20.26348930 medRxiv

Effective public health planning and intervention require an understanding of the temporal and geographic distribution of disease incidence, which in turn requires robust forecasting frameworks. However, because case counts and temporal dynamics vary widely, characterizing the distinct patterns of climate-sensitive diseases, including hotspots, trends, and seasonal variation, remains challenging. Furthermore, most studies predict future incidence directly from historical patterns and covariates, and a significant gap remains between methodological proliferation, in which diverse architectures are trained and validated on standardized, statistically stable benchmark datasets, and epidemiological reality, which is often characterized by irregular, sparse, and highly skewed data with rare but high-magnitude or bimodally distributed incidence. Hence, traditional end-to-end approaches that map climate data directly to disease incidence often fail in these data-scarce settings due to overfitting and poor generalization. To understand disease epidemiology and mitigate the impact of incidence, we analyzed a decade of retrospective data from Ethiopia to examine how climate and weather conditions influence the incidence or spread of climate-sensitive diseases, including malaria and dysentery. In this study, we proposed a two-stage hybrid framework, a climate-informed disease prediction model, to forecast the likelihood of disease incidence using decades of climate and weather data. First, deep learning was applied to capture latent weather dynamics. Then, a hurdle model using Extreme Gradient Boosting (XGB) was designed for zero-inflated incidence data: an XGBClassifier predicts whether incidence occurs and an XGBRegressor estimates its size, both based on the weather dynamics from the first stage.
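To make the hurdle step concrete, the following is a minimal sketch of how the outputs of the two stages could be combined into a single forecast. It is not the authors' implementation: the function name `hurdle_forecast`, the 0.5 threshold, and the choice between a hard gate and an expected-value combination are all assumptions made for illustration, and the inputs stand in for what an XGBClassifier and an XGBRegressor would produce.

```python
from typing import List, Sequence


def hurdle_forecast(
    p_incidence: Sequence[float],
    size_if_incidence: Sequence[float],
    threshold: float = 0.5,
    expected_value: bool = False,
) -> List[float]:
    """Combine the two stages of a hurdle model into one forecast per row.

    p_incidence: stage-1 probability that any incidence occurs
        (for example, the positive-class column of a classifier's
        predict_proba output).
    size_if_incidence: stage-2 estimate of the incidence size, assumed
        here to come from a regressor fitted on non-zero rows only.
    threshold, expected_value: hypothetical knobs, not from the abstract.
    """
    if len(p_incidence) != len(size_if_incidence):
        raise ValueError("both stages must score the same rows")

    forecasts = []
    for p, size in zip(p_incidence, size_if_incidence):
        size = max(size, 0.0)  # incidence size cannot be negative
        if expected_value:
            # Soft combination: probability-weighted size.
            forecasts.append(p * size)
        else:
            # Hard gate: predict zero unless the hurdle is crossed.
            forecasts.append(size if p >= threshold else 0.0)
    return forecasts


if __name__ == "__main__":
    probs = [0.1, 0.9, 0.5]
    sizes = [5.0, 12.0, 3.0]
    print(hurdle_forecast(probs, sizes))
    print(hurdle_forecast(probs, sizes, expected_value=True))
```

The hard gate yields exact zeros for rows the classifier considers incidence-free, which suits zero-inflated data; the expected-value variant trades those exact zeros for smoother forecasts.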
Our proposed multivariate climate-driven disease incidence model incorporates both spatial (elevation, coordinates) and temporal (year, month) factors, along with key weather parameters (precipitation, sunlight, wind, relative humidity, temperature), to predict the likelihood of multiple diseases occurring in each area, serving as a foundation for future disease incidence predictions in the region. Across 72 experiments spanning four categories and six targets, pairwise Diebold-Mariano tests on climate variable forecasting showed that the Transformer model achieved the highest number of statistically significant wins (n=18, 25.0%), compared with Long Short-Term Memory (LSTM) (n=9, 12.5%) and the Temporal Convolutional Neural Network (TCN) (n=5, 6.9%). The hurdle model that combines XGBClassifier and XGBRegressor outperformed the baseline in both malaria and dysentery forecasting. Error stratification revealed that the hurdle model provided the greatest benefit during incidence periods, with a substantially lower Mean Absolute Error (MAE) than the baseline in both incidence and non-incidence periods. Our proposed modular pipeline first forecasts climate variables, then predicts disease incidence, thereby enhancing interpretability and generalization in data-sparse settings. Overall, this approach provides a scalable, climate-aware forecasting tool for public health planning, particularly in regions where these diseases are endemic or where climate change may affect their prevalence, as well as in data-scarce settings.
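For readers unfamiliar with the Diebold-Mariano test used in the model comparison, a minimal sketch of the statistic follows. The abstract does not state the loss function, forecast horizon, or any small-sample correction, so the squared-error loss, the default one-step horizon `h=1`, the normal approximation for the p-value, and the function name `diebold_mariano` are assumptions for illustration only.

```python
import math
from typing import Sequence, Tuple


def diebold_mariano(
    errors_a: Sequence[float],
    errors_b: Sequence[float],
    h: int = 1,
) -> Tuple[float, float]:
    """Return (DM statistic, two-sided p-value) for two forecast error series.

    A negative statistic means model A has the lower squared-error loss.
    Assumes squared-error loss and a normal approximation.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("error series must have the same length")
    n = len(errors_a)
    if n < 2:
        raise ValueError("need at least two observations")

    # Loss differential d_t = e_a,t^2 - e_b,t^2
    d = [a * a - b * b for a, b in zip(errors_a, errors_b)]
    mean_d = sum(d) / n

    def autocov(lag: int) -> float:
        return sum(
            (d[t] - mean_d) * (d[t - lag] - mean_d) for t in range(lag, n)
        ) / n

    # Long-run variance with autocovariances up to lag h - 1.
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, h))
    if long_run_var <= 0.0:
        raise ValueError("non-positive variance estimate; test is undefined")

    stat = mean_d / math.sqrt(long_run_var / n)
    # Two-sided p-value under N(0, 1): 2 * (1 - Phi(|stat|)).
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


if __name__ == "__main__":
    model_a = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 0.8, 1.1]
    model_b = [0.1, 0.2, 0.1, 0.2, 0.1, 0.2, 0.1, 0.2]
    print(diebold_mariano(model_a, model_b))
```

Under this sketch, a "statistically significant win" for a model would correspond to a statistic whose sign favors that model together with a p-value below the chosen significance level; the level used in the study is not given in the abstract.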

Matching journals

The top 10 journals account for 50% of the predicted probability mass.

1. PLOS Global Public Health; 293 papers in training set; Top 0.9%; 10.2%
2. Nature Communications; 4913 papers in training set; Top 24%; 7.3%
3. Scientific Reports; 3102 papers in training set; Top 13%; 6.9%
4. PLOS ONE; 4510 papers in training set; Top 27%; 6.5%
5. Frontiers in Public Health; 140 papers in training set; Top 1%; 4.9%
6. PLOS Computational Biology; 1633 papers in training set; Top 8%; 4.4%
7. Malaria Journal; 48 papers in training set; Top 0.5%; 4.4%
8. Infectious Diseases of Poverty; 10 papers in training set; Top 0.1%; 2.9%
9. BMC Medicine; 163 papers in training set; Top 2%; 2.1%
10. npj Digital Medicine; 97 papers in training set; Top 2%; 2.1%

50% of probability mass above

11. Science of The Total Environment; 179 papers in training set; Top 3%; 2.1%
12. Journal of Medical Internet Research; 85 papers in training set; Top 2%; 1.9%
13. Patterns; 70 papers in training set; Top 0.6%; 1.9%
14. eLife; 5422 papers in training set; Top 39%; 1.8%
15. Epidemics; 104 papers in training set; Top 0.9%; 1.7%
16. Nature Medicine; 117 papers in training set; Top 2%; 1.7%
17. BMC Infectious Diseases; 118 papers in training set; Top 3%; 1.7%
18. PLOS Neglected Tropical Diseases; 378 papers in training set; Top 3%; 1.5%
19. Heliyon; 146 papers in training set; Top 3%; 1.3%
20. eBioMedicine; 130 papers in training set; Top 2%; 1.3%
21. Expert Systems with Applications; 11 papers in training set; Top 0.2%; 1.3%
22. The American Journal of Tropical Medicine and Hygiene; 60 papers in training set; Top 3%; 1.2%
23. Communications Biology; 886 papers in training set; Top 14%; 1.2%
24. Environmental Research; 46 papers in training set; Top 1%; 1.1%
25. GeoHealth; 10 papers in training set; Top 0.5%; 1.0%
26. The Innovation; 12 papers in training set; Top 0.6%; 1.0%
27. BMJ Global Health; 98 papers in training set; Top 2%; 0.8%
28. JMIR Public Health and Surveillance; 45 papers in training set; Top 3%; 0.8%
29. The Lancet Regional Health - Western Pacific; 15 papers in training set; Top 0.2%; 0.8%
30. International Journal of Medical Informatics; 25 papers in training set; Top 2%; 0.8%
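
The statement that the top 10 journals account for 50% of the predicted probability mass can be checked with a short cumulative sum over the per-journal percentages listed above. This is only an arithmetic sketch: the helper name `journals_to_reach` is hypothetical, and the values are the rounded percentages as displayed, so totals are approximate.

```python
from itertools import accumulate
from typing import Optional, Sequence

# Predicted probability (%) for ranks 1 to 30, copied from the list above.
PROBABILITIES = [
    10.2, 7.3, 6.9, 6.5, 4.9, 4.4, 4.4, 2.9, 2.1, 2.1,
    2.1, 1.9, 1.9, 1.8, 1.7, 1.7, 1.7, 1.5, 1.3, 1.3,
    1.3, 1.2, 1.2, 1.1, 1.0, 1.0, 0.8, 0.8, 0.8, 0.8,
]


def journals_to_reach(
    probabilities: Sequence[float], target: float
) -> Optional[int]:
    """Return the smallest rank whose cumulative probability reaches target.

    Returns None if the listed probabilities never reach the target.
    """
    for rank, total in enumerate(accumulate(probabilities), start=1):
        if total >= target:
            return rank
    return None


if __name__ == "__main__":
    print(journals_to_reach(PROBABILITIES, 50.0))
```

With the listed values, the cumulative sum first reaches 50% at rank 10, which matches the position of the "50% of probability mass above" marker.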