
A Machine Learning Framework for Constructing Heterogeneous Contact Networks: Implications for Epidemic Modelling

Murray Kearney, L.; Davis, E. L.; Keeling, M. J.

2026-03-16 epidemiology
10.64898/2026.03.14.26348396 medRxiv

Capturing the structured mixing within a population is key to the reliable projection of infectious disease dynamics and hence informed control. Both heterogeneity in the number of epidemiologically-relevant contacts and age-structured mixing have been repeatedly demonstrated as fundamental, yet are rarely combined. Networks provide a powerful and intuitive method to realise these two elements of population structure, and simulate infection dynamics. While there are a few key examples of contact networks being measured explicitly, this is not scalable to larger populations, where representative networks must be constructed from more ubiquitous individual-level data. Here, using data from social contact surveys, we develop a generalisable and robust algorithm utilising machine learning to generate a surrogate population-scale network that preserves both age-structured mixing and heterogeneity of contacts. For different datasets and network construction assumptions we simulate the spread of infection, considering how the epidemic size varies over basic reproduction number (R0) scenarios, mirroring the process of determining public health impact from early epidemic growth. Our approach shows that both age structure and degree heterogeneity substantially reduce the epidemic size (for a given R0) compared to simpler models. We also demonstrate that these simulations more accurately recapture the heterogeneity in secondary cases that has been observed, when transmission is scaled by contact duration to dampen the effect of highly connected nodes ("super-spreaders"). By using survey data collected during 2020-2022, these network models also inform about the impacts of control and targeting of public health interventions: quantifying the non-linear reduction in transmission opportunities that occurred during lockdowns, and the ages and contact types most responsible for onward transmission. Our robust methodology therefore allows the full wealth of data commonly collected by surveys, but frequently overlooked, to be incorporated into more realistic transmission models of infectious diseases.
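To make the idea of simulating infection on an age-structured, degree-heterogeneous network concrete, the following is a minimal Python sketch. It is not the authors' machine-learning construction: the stub-drawing rule, the function names `build_network` and `final_size`, the mixing matrix, the degree distribution, and the per-contact transmission probability are all hypothetical choices made for illustration only.

```python
import random
from collections import defaultdict


def build_network(ages, degrees, mixing, seed=0):
    """Illustrative network builder (an assumption, not the paper's algorithm).

    Node i initiates degrees[i] contacts. For each one, the partner's age
    group is drawn from the row mixing[ages[i]], then a partner is drawn
    uniformly from that group. Nodes also receive contacts from others, so
    realised degrees only approximate the targets.
    """
    rng = random.Random(seed)
    by_age = defaultdict(list)
    for node, age in enumerate(ages):
        by_age[age].append(node)
    groups = sorted(by_age)
    neighbours = [set() for _ in ages]
    for i, (age, k) in enumerate(zip(ages, degrees)):
        weights = [mixing[age][g] for g in groups]
        for _ in range(k):
            g = rng.choices(groups, weights=weights)[0]
            j = rng.choice(by_age[g])
            if j != i:  # skip self-contacts; repeated pairs collapse in the set
                neighbours[i].add(j)
                neighbours[j].add(i)
    return neighbours


def final_size(neighbours, p, seeds, seed=0):
    """Generation-based SIR: each infected node infects each susceptible
    neighbour with probability p, then recovers. Returns the number of
    nodes ever infected."""
    rng = random.Random(seed)
    infected = set(seeds)
    ever = set(seeds)
    while infected:
        new = set()
        for i in sorted(infected):
            for j in sorted(neighbours[i]):
                if j not in ever and j not in new and rng.random() < p:
                    new.add(j)
        ever |= new
        infected = new
    return len(ever)


# Hypothetical inputs: three age groups, assortative mixing, and a small
# fraction of highly connected individuals.
rng = random.Random(1)
n = 300
ages = [rng.randrange(3) for _ in range(n)]
mixing = [[0.6, 0.3, 0.1],
          [0.3, 0.5, 0.2],
          [0.1, 0.2, 0.7]]
degrees = [10 if rng.random() < 0.1 else 1 for _ in range(n)]
net = build_network(ages, degrees, mixing, seed=2)
print("nodes ever infected:", final_size(net, p=0.2, seeds=[0], seed=3))
```

Repeating the run over many random seeds, and swapping `degrees` for a constant list with the same mean, would be one simple way to explore how degree heterogeneity changes the distribution of epidemic sizes in this toy setting.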

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Epidemics | 104 | Top 0.1% | 22.4% |
| 2 | PLOS Computational Biology | 1633 | Top 1% | 18.5% |
| 3 | Journal of The Royal Society Interface | 189 | Top 0.1% | 12.6% |
| 4 | Proceedings of the National Academy of Sciences | 2130 | Top 11% | 6.3% |
| 5 | Scientific Reports | 3102 | Top 28% | 4.3% |
| 6 | Nature Communications | 4913 | Top 37% | 3.9% |
| 7 | eLife | 5422 | Top 36% | 2.1% |
| 8 | Proceedings of the Royal Society B: Biological Sciences | 341 | Top 4% | 1.7% |
| 9 | Physical Review X | 23 | Top 0.2% | 1.7% |
| 10 | Science | 429 | Top 14% | 1.7% |
| 11 | Science Advances | 1098 | Top 17% | 1.7% |
| 12 | PLOS ONE | 4510 | Top 57% | 1.5% |
| 13 | Nature | 575 | Top 13% | 1.2% |
| 14 | Epidemiology | 26 | Top 0.4% | 1.2% |
| 15 | American Journal of Epidemiology | 57 | Top 1.0% | 1.2% |
| 16 | Nature Medicine | 117 | Top 4% | 0.9% |
| 17 | Physical Biology | 43 | Top 2% | 0.9% |
| 18 | Royal Society Open Science | 193 | Top 4% | 0.8% |
| 19 | Statistics in Medicine | 34 | Top 0.3% | 0.8% |
| 20 | Philosophical Transactions of the Royal Society B | 51 | Top 6% | 0.7% |
| 21 | PLOS Digital Health | 91 | Top 3% | 0.7% |
| 22 | Physical Review Research | 46 | Top 0.9% | 0.7% |
| 23 | Bulletin of Mathematical Biology | 84 | Top 2% | 0.7% |
| 24 | Philosophical Transactions of the Royal Society B: Biological Sciences | 53 | Top 2% | 0.6% |
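
The statement that the top 3 journals account for 50% of the predicted probability mass can be checked with a short cumulative sum over the listed probabilities. The sketch below uses the 24 percentages shown above; the helper name `ranks_to_reach` is hypothetical.

```python
from itertools import accumulate

# Predicted probabilities (in percent) for ranks 1 to 24, as listed above.
probabilities = [22.4, 18.5, 12.6, 6.3, 4.3, 3.9, 2.1, 1.7, 1.7, 1.7, 1.7,
                 1.5, 1.2, 1.2, 1.2, 0.9, 0.9, 0.8, 0.8, 0.7, 0.7, 0.7,
                 0.7, 0.6]


def ranks_to_reach(probs, threshold):
    """Return (rank, cumulative total) at the first rank whose running
    total reaches the threshold, or None if it is never reached."""
    for rank, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return rank, total
    return None


rank, total = ranks_to_reach(probabilities, 50.0)
print(rank, round(total, 1))
```

The first two journals sum to 40.9%, so the third (12.6%) is the one that carries the running total past the 50% mark, to 53.5%.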