From naive to foundation: benchmarking models for epidemic forecasting

Wang, D.; Li, Y.; Perra, N.

2026-05-13 epidemiology
10.64898/2026.05.11.26352889 medRxiv
We systematically evaluate and compare the performance of classical statistical methods (ARIMA), mechanistic compartmental models (SEIR), modern deep learning architectures (LSTM, DLinear, Autoformer), and an emerging time-series foundation model (TabPFN-TS) in forecasting the incidence of Influenza-Like Illness (ILI) across nine European countries. The models are benchmarked against a naive baseline and a multi-model ensemble (RespiCast) created by an initiative of the ECDC. In line with the operational practice of existing forecasting hubs, our entire evaluation is explicitly optimized for short-term horizons (1 to 4 weeks ahead). Interestingly, we found that the foundation model TabPFN-TS shows strong zero-shot inference capabilities. Without any task-specific retraining, it overcomes extreme data scarcity to consistently outperform all other individual architectures, frequently rivalling or surpassing the RespiCast ensemble. Our results highlight how deep learning architectures are severely constrained by the extreme data scarcity typical of epidemic forecasting, and require targeted endogenous data augmentation to reduce predictive errors. Within the deep learning class of models, we observe that simpler architectures (such as DLinear and LSTM) frequently exhibit greater robustness and outperform complex, attention-based models (such as Autoformer) when data are scarce. Finally, our results show that a weighted ensemble, constructed by fusing all the models, delivers highly robust forecasts in all regions considered. Overall, our findings showcase the transformative potential of zero-shot foundation models in epidemic forecasting and confirm the importance of multi-model ensembles.
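
The abstract refers to a naive baseline and a weighted ensemble without stating how either is built. The sketch below is only an illustration under stated assumptions: the baseline is taken to be a persistence forecast (last observed value repeated over the 1 to 4 week horizon), and the ensemble weights are taken to be proportional to the inverse of each model's validation error. All function names and the weighting rule are hypothetical and are not drawn from the paper.

```python
from typing import Dict, List


def naive_forecast(history: List[float], horizon: int = 4) -> List[float]:
    """Persistence baseline (assumed): repeat the last observation for each week ahead."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


def mean_absolute_error(y_true: List[float], y_pred: List[float]) -> float:
    """Average absolute difference between observations and forecasts."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def inverse_error_weights(errors: Dict[str, float], eps: float = 1e-9) -> Dict[str, float]:
    """Assumed weighting rule: weight each model by 1/error, normalized to sum to 1."""
    inverse = {name: 1.0 / (err + eps) for name, err in errors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def weighted_ensemble(forecasts: Dict[str, List[float]],
                      weights: Dict[str, float]) -> List[float]:
    """Combine per-model forecasts into one forecast per horizon step."""
    horizon = len(next(iter(forecasts.values())))
    return [
        sum(weights[name] * forecasts[name][h] for name in forecasts)
        for h in range(horizon)
    ]
```

In use, one would compute each model's error on a held-out window with `mean_absolute_error`, pass the resulting dictionary to `inverse_error_weights`, and then call `weighted_ensemble` on the models' 1 to 4 week forecasts. Models with lower validation error receive larger weights under this rule.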

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Scientific Reports | 3102 papers in training set | Top 6% | 10.3%
2. PLOS Computational Biology | 1633 papers in training set | Top 3% | 10.3%
3. Nature Machine Intelligence | 61 papers in training set | Top 0.2% | 8.6%
4. Nature Communications | 4913 papers in training set | Top 26% | 7.0%
5. Epidemics | 104 papers in training set | Top 0.3% | 4.4%
6. PLOS ONE | 4510 papers in training set | Top 33% | 4.4%
7. npj Digital Medicine | 97 papers in training set | Top 1% | 4.4%
8. Nature Human Behaviour | 85 papers in training set | Top 0.9% | 3.7%
50% of probability mass above
9. Computers in Biology and Medicine | 120 papers in training set | Top 1% | 2.8%
10. Swiss Medical Weekly | 12 papers in training set | Top 0.1% | 2.4%
11. npj Systems Biology and Applications | 99 papers in training set | Top 0.8% | 2.1%
12. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 29% | 1.9%
13. Patterns | 70 papers in training set | Top 0.6% | 1.9%
14. Nature Medicine | 117 papers in training set | Top 2% | 1.8%
15. IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 1% | 1.7%
16. Biology Methods and Protocols | 53 papers in training set | Top 1% | 1.4%
17. Frontiers in Public Health | 140 papers in training set | Top 7% | 0.9%
18. Communications Medicine | 85 papers in training set | Top 0.7% | 0.9%
19. Bioinformatics Advances | 184 papers in training set | Top 4% | 0.9%
20. Royal Society Open Science | 193 papers in training set | Top 4% | 0.9%
21. Journal of Medical Internet Research | 85 papers in training set | Top 4% | 0.9%
22. Communications Biology | 886 papers in training set | Top 20% | 0.8%
23. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 9% | 0.8%
24. BMC Infectious Diseases | 118 papers in training set | Top 5% | 0.8%
25. Biology | 43 papers in training set | Top 3% | 0.7%
26. Infectious Disease Modelling | 50 papers in training set | Top 1% | 0.7%
27. Biomechanics and Modeling in Mechanobiology | 25 papers in training set | Top 1.0% | 0.7%
28. Scientific Data | 174 papers in training set | Top 3% | 0.7%
29. Chaos, Solitons & Fractals | 32 papers in training set | Top 2% | 0.7%
30. eLife | 5422 papers in training set | Top 61% | 0.7%