Simpler is not always better: Phylodynamic misspecification and deep-learning corrections
Xie, R.; Gascuel, O.; Zhukova, A.
Phylodynamics bridges the gap between epidemiology and pathogen genetic data by estimating epidemiological parameters from time-scaled pathogen phylogenies. Multi-type birth-death (MTBD) models are the phylodynamic analogues of compartmental models in classical epidemiology. They serve to infer the average number of secondary infections R and the infection duration d. More complex MTBD models add further parameters, such as the average length of the incubation period or the proportion of superspreaders in the infected population. However, these additional parameters come at a substantial computational cost: apart from the simplest, birth-death (BD), model, MTBD models have no closed-form solution and require numerical methods for their likelihood computation. This leads to longer computation times and potential numerical errors. The BD model therefore remains researchers' preferred choice for analyses of real datasets, and is often applied even when more complex epidemiological aspects are present. We investigated, using simulations, how model misspecification influences inference of R and d in the phylodynamic framework. We showed that using models that do not account for various epidemiological aspects leads to bias. In particular, the simplest estimator, BD, tends to underestimate R in the presence of superspreading or incubation, which might be dangerous from a public health perspective. In contrast, deep-learning-based estimators for complex models, which account for multiple epidemiological factors, perform well both on data where those factors are present and on data where they are absent. This advocates for the use of complex, epidemiologically realistic estimators, whose design has recently become possible thanks to deep learning.
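To make the two quantities of interest concrete, here is a minimal sketch of how R and d relate to the rates of a BD model. It assumes a common parameterization that the abstract does not spell out: a per-individual transmission rate and a removal (becoming non-infectious) rate, with R equal to the transmission rate divided by the removal rate and d equal to the inverse of the removal rate. The function name `bd_epi_params` and its arguments are hypothetical and are not taken from the authors' software.

```python
from typing import Tuple


def bd_epi_params(transmission_rate: float, removal_rate: float) -> Tuple[float, float]:
    """Return (R, d) for a simple birth-death model.

    Assumed parameterization (not stated in the abstract):
      R = transmission_rate / removal_rate   # average number of secondary infections
      d = 1 / removal_rate                   # average infection duration
    Rates are expressed per unit of time (for example, per day).
    """
    if transmission_rate < 0:
        raise ValueError("transmission_rate must be non-negative")
    if removal_rate <= 0:
        raise ValueError("removal_rate must be positive")
    r = transmission_rate / removal_rate
    d = 1.0 / removal_rate
    return r, d


# Illustrative values only: 0.5 transmissions per day, removal rate 0.25 per day.
r_value, d_value = bd_epi_params(0.5, 0.25)
print(r_value, d_value)
```

Under this assumed parameterization, a BD estimator that underestimates R reports a transmission-to-removal ratio that is too small, which is the kind of bias the abstract describes under superspreading or incubation.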
Matching journals
The top 3 journals account for 50% of the predicted probability mass.