Scalable deep-learning-based inference of time-varying transmission dynamics from outbreak phylogenies
Xie, R.; Zhukova, A.; Pena, P. G.; Iglesias, G.; Hu, S.; Wang, J.; Tsang, T. K.; Dhanasekaran, V.; Kraemer, M. U. G.; Pybus, O. G.; Gascuel, O.
Infectious disease dynamics can be inferred from pathogen genomic data using phylodynamic methods, but the applicability of many such approaches to large data sets is constrained by computational cost. Recent deep-learning approaches to phylodynamics have improved scalability, yet challenges remain when genetic divergence is limited during fast-spreading outbreaks. To address this, we use pathogen-specific models to show that deep-learning models trained on outbreak-like phylogenies can accurately estimate the reproductive number (R) when both the birth-death model and the expected phylogenetic resolution are matched to the target pathogen, highlighting the importance of realistic training conditions. Focusing on three major respiratory pathogens of public health importance (SARS-CoV-2, seasonal human influenza virus, and respiratory syncytial virus (RSV)), we introduce PhyloRt, a scalable framework for estimating the time-varying reproductive number (Rt) from large outbreak phylogenies. PhyloRt decomposes large trees into overlapping subtrees and applies a hierarchical deep-learning-based inference strategy to classify subtrees as exhibiting a constant or time-varying reproductive number, enabling identifiable and computationally efficient estimation of Rt as a piecewise-constant trajectory through time. Applications to SARS-CoV-2 and influenza outbreaks show that PhyloRt recovers transmission dynamics consistent with estimates derived from mathematical epidemiological and Bayesian phylodynamic analyses. Our work enables scalable and rapid estimation of time-varying transmission dynamics from very large outbreak genomic data sets, supporting real-time genomic epidemiology of emerging pathogens.
Significance
Estimating changes in transmission dynamics over time is important for responding to infectious disease outbreaks. Current methods mostly rely on reported case data from epidemiological surveillance, which can be biased or incomplete due to variable testing capabilities, particularly in resource-limited settings. A complementary approach is to use viral genomes as an alternative data source. However, inferences from genomic data can be computationally intensive and have mainly been applied retrospectively. We present PhyloRt, a scalable deep-learning-based phylodynamic framework that enables fast inference of the time-varying reproductive number (Rt) from large outbreak phylogenies. Our approach is widely applicable and offers a practical means of monitoring epidemic dynamics, complementing traditional surveillance and supporting timely public health decision-making.
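To make the idea of a piecewise-constant Rt trajectory built from overlapping pieces more concrete, the following is a minimal Python sketch. It is not the PhyloRt implementation: the function name, the `(start, end, R)` tuple layout, and the rule of averaging estimates wherever windows overlap are all assumptions made purely for illustration, and the abstract does not say how PhyloRt actually combines subtree-level estimates.

```python
from typing import List, Tuple

# Hypothetical per-subtree estimate: (start_time, end_time, R_estimate).
# In practice these would come from an inference step; here they are inputs.
Estimate = Tuple[float, float, float]


def piecewise_constant_rt(estimates: List[Estimate]) -> List[Estimate]:
    """Combine overlapping per-window R estimates into a piecewise-constant Rt.

    Assumed combination rule (illustrative only): within each elementary
    interval between consecutive window boundaries, take the plain average of
    all estimates whose window covers that interval.
    """
    # Every window boundary is a candidate change point of the trajectory.
    breakpoints = sorted({t for start, end, _ in estimates for t in (start, end)})
    trajectory = []
    for left, right in zip(breakpoints, breakpoints[1:]):
        # Keep only the estimates whose window fully covers [left, right].
        covering = [r for start, end, r in estimates if start <= left and end >= right]
        if covering:  # skip gaps that no window covers
            trajectory.append((left, right, sum(covering) / len(covering)))
    return trajectory


# Two overlapping windows: R = 2.0 on [0, 10] and R = 1.0 on [5, 15].
example = piecewise_constant_rt([(0.0, 10.0, 2.0), (5.0, 15.0, 1.0)])
```

With the two windows above, the overlap on [5, 10] receives the average of both estimates, while the non-overlapping ends keep their single estimate. A weighted average (for example by subtree size or estimate uncertainty) would be a natural alternative under the same assumptions.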