Estimating disease spread using structured coalescent and birth-death models: A quantitative comparison

Seidel, S.; Stadler, T.; Vaughan, T.

2020-11-30 bioinformatics
10.1101/2020.11.30.403741 bioRxiv

Understanding how disease transmission occurs between subpopulations is critically important for guiding disease control efforts, whether the subpopulations represent geographically separated people, age groups, or risk groups. The structured coalescent (SC) and the multitype birth-death (MBD) model can both be used to infer migration rates between subpopulations from phylogenies reconstructed from pathogen genetic sequences. However, the two classes of phylodynamic methods rely on different assumptions. Here, we report on a simulation study that compares inferences made using these models for a variety of migration rates in both endemic diseases and epidemic outbreaks. For the epidemic outbreak, we found that the MBD recovers the true migration rates better than the SC regardless of migration rate. We hypothesize that the inaccurate SC estimates stem from its assumption of a constant population size. For the endemic scenario, our analysis shows that both models obtain a similar coverage of the migration rates, while the SC provides slightly narrower posterior intervals. Irrespective of the scenario, both models estimate the root location with similar coverage. Our study provides concrete modelling advice for infectious disease analysts. For endemic disease either model can be used, while for epidemic outbreaks the MBD should be the model of choice. Additionally, our study reveals the need to develop the SC further such that varying population sizes can easily be taken into account.

Author summary

Controlling an infectious disease requires us to quantify and understand how it spreads through pools of susceptible individuals, defined by their belonging to different geographical regions, age groups, or risk groups. Rates of pathogen movement between these pools can be inferred from pathogen phylogenies, which are themselves reconstructed from pathogen genetic sequences collected from infected individuals. Two popular foundations for such models are the multitype birth-death model and the structured coalescent. Although these models fulfill the same purpose, they differ in their assumptions and can, hence, produce contrasting results. To assess the appropriateness of the models in different situations, we performed a simulation study. We find that, for endemic diseases, both models are able to estimate the migration parameters reliably. For epidemic outbreaks, however, the multitype birth-death model obtains better estimates of the migration rates. We hypothesize that the structured coalescent's inaccurate estimates for the epidemic scenario arise because it assumes a constant number of infected individuals through time.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. PLOS Computational Biology (1633 papers in training set): Top 0.2%, 32.8%
2. Theoretical Population Biology (47 papers in training set): Top 0.1%, 10.0%
3. PLOS ONE (4510 papers in training set): Top 25%, 6.8%
4. Methods in Ecology and Evolution (160 papers in training set): Top 0.9%, 3.6%

50% of probability mass above

5. PLOS Genetics (756 papers in training set): Top 5%, 3.6%
6. Scientific Reports (3102 papers in training set): Top 40%, 3.2%
7. Genetics (225 papers in training set): Top 1%, 3.0%
8. Bioinformatics (1061 papers in training set): Top 6%, 2.7%
9. PeerJ (261 papers in training set): Top 4%, 2.7%
10. Journal of Theoretical Biology (144 papers in training set): Top 0.7%, 2.1%
11. Journal of The Royal Society Interface (189 papers in training set): Top 2%, 2.1%
12. Peer Community Journal (254 papers in training set): Top 1%, 2.1%
13. BMC Bioinformatics (383 papers in training set): Top 4%, 2.1%
14. G3 Genes|Genomes|Genetics (351 papers in training set): Top 1%, 1.7%
15. Epidemics (104 papers in training set): Top 1.0%, 1.7%
16. Heredity (53 papers in training set): Top 0.1%, 1.3%
17. GENETICS (189 papers in training set): Top 1%, 0.9%
18. Journal of Open Source Software (22 papers in training set): Top 0.2%, 0.7%
19. G3: Genes, Genomes, Genetics (222 papers in training set): Top 1.0%, 0.7%
20. Biology Methods and Protocols (53 papers in training set): Top 3%, 0.7%
21. Royal Society Open Science (193 papers in training set): Top 5%, 0.7%
22. Physical Biology (43 papers in training set): Top 2%, 0.7%
23. Frontiers in Genetics (197 papers in training set): Top 11%, 0.6%
24. F1000Research (79 papers in training set): Top 6%, 0.6%
25. Systematic Biology (121 papers in training set): Top 0.5%, 0.6%