Data Assimilation Substitutes for Biological Complexity in Hybrid Influenza Forecasting Models

Alleman, T. W.; Van Wesemael, T.; Shanker, N.; Mietchen, M. S.; Loo, S.; Ajagbe, S. O.; Baetens, J. M.; Lemaitre, J.; Hill, A. L.; Truelove, S. A.; Bento, A. I.

2026-05-27 public and global health
10.64898/2026.05.19.26353597 medRxiv

Hybrid mechanistic-statistical models offer interpretability and adaptability for short-term seasonal epidemic forecasting, but it remains unclear whether their accuracy depends more on increased biological complexity or on the assimilation of richer data. Using eight retrospective influenza seasons in North Carolina, we evaluate whether training on historical data and assimilating auxiliary emergency department (ED) visit data improves four-week-ahead hospital admission forecasts more than adding biological complexity (multi-subtype structure and cross-season immunity). Hierarchical Bayesian training on historical data improves accuracy by 22.4 % (95 % CI: 16.4-28.1 %), and inclusion of ED visit data yields a further 5.3 % (95 % CI: 3.0-7.6 %) improvement, whereas added biological complexity produces diminishing or null gains. We further observe a substitution effect in which ED visit data partially compensates for omitted biological structure. We deployed a simplified model variant in the 2025-2026 CDC FluSight Challenge and ranked among the top ensemble performers, supporting the robustness of Bayesian hierarchical training in real time. Together, these findings indicate that short-term forecast accuracy is driven more by historical learning and assimilating auxiliary signals than by biological fidelity, with implications for how forecasting systems should balance mechanistic complexity.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | PLOS Computational Biology | 1633 | Top 0.7% | 22.4% |
| 2 | npj Digital Medicine | 97 | Top 0.3% | 18.5% |
| 3 | Proceedings of the National Academy of Sciences | 2130 | Top 11% | 6.3% |
| 4 | Nature Communications | 4913 | Top 29% | 6.3% |
| 5 | PLOS ONE | 4510 | Top 31% | 4.8% |
| 6 | Scientific Reports | 3102 | Top 31% | 3.9% |
| 7 | Epidemics | 104 | Top 0.4% | 3.6% |
| 8 | eLife | 5422 | Top 24% | 3.6% |
| 9 | Journal of The Royal Society Interface | 189 | Top 1% | 3.0% |
| 10 | Nature Medicine | 117 | Top 1% | 2.9% |
| 11 | Communications Biology | 886 | Top 15% | 1.2% |
| 12 | Expert Systems with Applications | 11 | Top 0.2% | 1.2% |
| 13 | JMIR Public Health and Surveillance | 45 | Top 2% | 1.2% |
| 14 | Patterns | 70 | Top 2% | 1.2% |
| 15 | Journal of Medical Internet Research | 85 | Top 4% | 0.9% |
| 16 | Nature Human Behaviour | 85 | Top 4% | 0.9% |
| 17 | PNAS Nexus | 147 | Top 1% | 0.8% |
| 18 | American Journal of Epidemiology | 57 | Top 1% | 0.7% |
| 19 | BMC Infectious Diseases | 118 | Top 5% | 0.7% |
| 20 | Journal of Neural Engineering | 197 | Top 2% | 0.7% |
| 21 | Frontiers in Public Health | 140 | Top 8% | 0.7% |
| 22 | BMC Medicine | 163 | Top 8% | 0.6% |
| 23 | Frontiers in Physiology | 93 | Top 7% | 0.6% |
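
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch of that arithmetic; the function name `journals_to_reach` and the variable `top_probabilities` are hypothetical and introduced here only for illustration, and the values are the percentages listed for ranks 1 to 6.

```python
from itertools import accumulate

# Predicted probabilities (%) for ranks 1-6, copied from the journal list.
top_probabilities = [22.4, 18.5, 6.3, 6.3, 4.8, 3.9]


def journals_to_reach(probabilities, threshold):
    """Return how many top-ranked entries are needed for the running
    total to reach the threshold, or None if it is never reached."""
    for count, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return count
    return None


# Running totals are roughly 22.4, 40.9, 47.2, 53.5, so the 50% mark
# is crossed at the fourth journal.
print(journals_to_reach(top_probabilities, 50.0))
```

The running total first exceeds 50% at rank 4 (about 53.5%), which agrees with the summary line above the list.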