
Baseline model design choices can substantially influence performance in collaborative forecast hubs

Suez, E.; Fox, S. J.

medRxiv, 2026-03-20 (epidemiology). DOI: 10.64898/2026.03.18.26348748

Over the past decade, outbreak forecasting has become an increasingly common tool for assisting public health decision-making during epidemics. Collaborative forecast hubs, in which multiple teams submit predictions in real time, are the gold standard for such efforts. Each hub uses a Baseline model as a performance benchmark for the other models. Although the Baseline is understood to be a naive forecast, its design is subjective, and the impact of these design decisions remains understudied. We evaluated how three Baseline specification decisions influence the performance of trend models, which forecast from historically observed dynamics: (1) the amount of historical data used for training, (2) whether the data are transformed, and (3) whether forecasts follow a flatline variant (constant predictions) or a drift variant (allowing a slope). We generated retrospective forecasts for multiple years across four surveillance targets: COVID-19, influenza, and RSV hospital admissions, and weighted influenza-like illness (wILI) percentage. For wILI, we additionally compared the trend baselines with a seasonal baseline model that leverages long-term historical patterns. Model specification significantly altered performance. The best-performing model across targets was a flatline model that used the most recent 6-12 transformed observations. It outperformed the current standard Baseline used in many forecast hubs by an average of 9.6% (range: 3.7-12.9%) across forecast targets, and it outperformed the seasonal baseline model by 32.3% across nine influenza seasons. Our results demonstrate that subjective Baseline design decisions can materially influence forecast accuracy and, consequently, the perceived rankings of models within collaborative forecast hubs. These findings highlight the need for greater transparency in Baseline model specifications and support the routine inclusion of multiple benchmark models within collaborative forecast hubs.
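The flatline variant described in the abstract can be sketched in a few lines. This is an illustrative assumption, not the authors' exact specification: the function name, the trailing-window length, the log transform, and the sampling-based uncertainty scheme (symmetrized one-step differences, as in common hub baselines) are all choices made here for demonstration.

```python
import numpy as np

def flatline_baseline(obs, window=8, horizons=4, log_transform=True,
                      quantiles=(0.025, 0.25, 0.5, 0.75, 0.975)):
    """Flatline (random-walk) baseline sketch: the point forecast equals the
    last observation; uncertainty comes from symmetrized one-step differences
    over a trailing training window of `window` observations."""
    y = np.asarray(obs, dtype=float)
    if log_transform:
        y = np.log1p(y)  # log1p handles zero counts in surveillance data
    recent = y[-window:]
    diffs = np.diff(recent)
    # Symmetrize the differences so the forecast has no systematic drift
    sym = np.concatenate([diffs, -diffs])
    last = y[-1]
    rng = np.random.default_rng(0)
    out = {}
    for h in range(1, horizons + 1):
        # h-step-ahead uncertainty: sum of h resampled one-step differences
        steps = rng.choice(sym, size=(10000, h)).sum(axis=1)
        q = np.quantile(last + steps, quantiles)
        if log_transform:
            q = np.expm1(q)  # back-transform to the natural scale
        out[h] = np.maximum(q, 0.0)  # admissions cannot be negative
    return out
```

Shortening `window` (e.g. to the 6-12 recent observations favored in the study) makes the uncertainty band track recent volatility; a drift variant would instead add `h * mean(diffs)` to the point forecast rather than forcing it flat.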

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. PLOS Computational Biology: 26.5% (1633 papers in training set; Top 0.4%)
2. Epidemics: 10.3% (104 papers in training set; Top 0.1%)
3. npj Digital Medicine: 9.4% (97 papers in training set; Top 0.6%)
4. PLOS ONE: 5.0% (4510 papers in training set; Top 30%)
(50% of probability mass above)
5. Scientific Reports: 4.1% (3102 papers in training set; Top 30%)
6. BMC Infectious Diseases: 4.1% (118 papers in training set; Top 0.8%)
7. Journal of Medical Internet Research: 4.0% (85 papers in training set; Top 1%)
8. Journal of The Royal Society Interface: 3.1% (189 papers in training set; Top 1%)
9. Infectious Disease Modelling: 2.1% (50 papers in training set; Top 0.6%)
10. Nature Communications: 1.9% (4913 papers in training set; Top 48%)
11. Frontiers in Public Health: 1.7% (140 papers in training set; Top 4%)
12. BMC Medical Research Methodology: 1.7% (43 papers in training set; Top 0.6%)
13. Proceedings of the National Academy of Sciences: 1.3% (2130 papers in training set; Top 37%)
14. Wellcome Open Research: 1.3% (57 papers in training set; Top 1%)
15. eLife: 1.3% (5422 papers in training set; Top 48%)
16. International Journal of Medical Informatics: 1.0% (25 papers in training set; Top 1%)
17. JMIR Public Health and Surveillance: 0.9% (45 papers in training set; Top 3%)
18. Epidemiology: 0.8% (26 papers in training set; Top 0.5%)
19. Computers in Biology and Medicine: 0.8% (120 papers in training set; Top 4%)
20. BMC Bioinformatics: 0.8% (383 papers in training set; Top 7%)
21. Royal Society Open Science: 0.8% (193 papers in training set; Top 5%)
22. Patterns: 0.7% (70 papers in training set; Top 3%)
23. Influenza and Other Respiratory Viruses: 0.7% (44 papers in training set; Top 0.5%)
24. npj Systems Biology and Applications: 0.7% (99 papers in training set; Top 3%)
25. American Journal of Epidemiology: 0.5% (57 papers in training set; Top 2%)
26. IEEE Journal of Biomedical and Health Informatics: 0.5% (34 papers in training set; Top 3%)
27. BMC Medicine: 0.5% (163 papers in training set; Top 9%)
28. mSystems: 0.5% (361 papers in training set; Top 8%)
29. PLOS Biology: 0.5% (408 papers in training set; Top 24%)