
Automated Model Discovery Based on COVID-19 Epidemiologic Data

Babazadeh Shareh, M.; Kleiner, F.; Böhme, M.; Hägele, C.; Dickmann, P.; Heintzmann, R.

2026-02-24 epidemiology
10.64898/2026.02.22.26346850 medRxiv

The COVID-19 pandemic has presented severe challenges in understanding and predicting the spread of infectious diseases, necessitating innovative approaches beyond traditional epidemiological models. This study introduces a method for automated model discovery using the Sparse Identification of Nonlinear Dynamics (SINDy) algorithm, leveraging a dataset from the COVID-19 outbreak in Thuringia, Germany, encompassing over 400,000 patient records and vaccination data. By analysing this dataset, we develop a flexible, data-driven model that captures many aspects of the complex dynamics of the pandemic's spread. Our approach incorporates external factors and interventions into the mathematical framework, leading to more accurate modelling of the pandemic's behaviour. The fixed coefficient values of the differential equation, as determined globally by SINDy, did not model the measured data accurately at the local level. We therefore refined our technique, starting from the differential equations found by SINDy, by investigating three modifications that account for recent local data. In the first approach, we re-optimized the coefficient values using the past seven days of data, without changing the globally determined differential equation. In the second approach, we allowed the coefficient values to depend on time and fitted them to all previous data in combination with regularization. In the third approach, we kept the coefficients fixed at their original values but augmented the differential equation with a small neural network, locally optimized on the data of the past week. Our findings reveal the critical role of vaccination and public health measures in the pandemic's trajectory. The proposed model offers a robust tool for policymakers and health professionals to mitigate future outbreaks, providing insights into the efficacy of intervention strategies and vaccination campaigns. This study advances the understanding of COVID-19 dynamics and lays the groundwork for future research in epidemic modelling, emphasising the importance of adaptive, data-informed approaches in public health planning.
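To make the global fit and the first local refinement concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy one-variable logistic outbreak, a polynomial candidate library, and a hand-written sequentially thresholded least-squares routine, which is the regression commonly used inside SINDy. The function names (`stlsq`, `refit_on_window`), the threshold, the sampling step, and the choice of window are all hypothetical; the abstract does not specify the library terms or settings used in the study.

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        if small.all():
            break
        # Refit only the library terms that survived the threshold.
        xi[~small] = np.linalg.lstsq(theta[:, ~small], dxdt, rcond=None)[0]
    return xi

def refit_on_window(theta, dxdt, xi_global, start, stop):
    """Re-optimize coefficient values on samples [start, stop) while
    keeping the set of active terms found by the global fit."""
    active = xi_global != 0.0
    xi_local = np.zeros_like(xi_global)
    xi_local[active] = np.linalg.lstsq(
        theta[start:stop][:, active], dxdt[start:stop], rcond=None
    )[0]
    return xi_local

# Assumed toy data: logistic growth dI/dt = r*I - (r/k)*I^2, sampled every 0.1 days.
r, k, i0 = 0.3, 1.0, 0.01
t = np.arange(0.0, 40.0, 0.1)
infected = k / (1.0 + (k - i0) / i0 * np.exp(-r * t))
didt = np.gradient(infected, t, edge_order=2)  # numerical time derivative

# Candidate library with columns [1, I, I^2].
theta = np.column_stack([np.ones_like(infected), infected, infected**2])

xi = stlsq(theta, didt)  # global sparse fit over all samples
# Seven days of data (70 samples) taken from the growth phase, days 12 to 19.
xi_local = refit_on_window(theta, didt, xi, start=120, stop=190)

print("global coefficients:", xi)
print("local coefficients: ", xi_local)
```

On this noise-free toy data the global and local fits should nearly coincide, because a single equation generated every sample. With real case counts the windowed refit can drift away from the global values, which is the situation the first approach is meant to handle.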

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | PLOS Computational Biology | 1633 | Top 1% | 18.8%
2 | Scientific Reports | 3102 | Top 6% | 10.2%
3 | Epidemics | 104 | Top 0.2% | 6.9%
4 | PLOS ONE | 4510 | Top 27% | 6.4%
5 | Journal of The Royal Society Interface | 189 | Top 0.5% | 6.4%
6 | Infectious Disease Modelling | 50 | Top 0.3% | 4.9%
50% of probability mass above
7 | Frontiers in Physics | 20 | Top 0.1% | 2.6%
8 | Royal Society Open Science | 193 | Top 1% | 2.4%
9 | Swiss Medical Weekly | 12 | Top 0.1% | 1.9%
10 | Epidemiology and Infection | 84 | Top 1% | 1.7%
11 | Chaos, Solitons & Fractals | 32 | Top 1% | 1.5%
12 | Mathematics | 11 | Top 0.1% | 1.5%
13 | Computers in Biology and Medicine | 120 | Top 3% | 1.2%
14 | Biology Methods and Protocols | 53 | Top 1% | 1.2%
15 | Heliyon | 146 | Top 5% | 0.9%
16 | Viruses | 318 | Top 4% | 0.9%
17 | BMC Medical Research Methodology | 43 | Top 1% | 0.8%
18 | BMC Infectious Diseases | 118 | Top 5% | 0.8%
19 | PeerJ | 261 | Top 13% | 0.8%
20 | Journal of Theoretical Biology | 144 | Top 2% | 0.8%
21 | Mathematical Biosciences | 42 | Top 1% | 0.8%
22 | Physical Biology | 43 | Top 2% | 0.8%
23 | Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences | 12 | Top 0.1% | 0.8%
24 | BMC Public Health | 147 | Top 6% | 0.8%
25 | Nature Communications | 4913 | Top 64% | 0.7%
26 | BMC Bioinformatics | 383 | Top 7% | 0.7%
27 | Microorganisms | 101 | Top 3% | 0.7%
28 | Biomechanics and Modeling in Mechanobiology | 25 | Top 0.9% | 0.7%
29 | Frontiers in Public Health | 140 | Top 8% | 0.7%
30 | Bulletin of Mathematical Biology | 84 | Top 2% | 0.7%