Synonymous substitution rate slowdown preceding the emergence of SARS-CoV-2 variants and during persistent infections

Havens, J. L.; Gangavarapu, K.; Wang, J. C.; Taki, F.; Luoma, E.; Pekar, J. E.; Amin, H.; Di Lonardo, S.; Omoregie, E.; Hughes, S.; Andersen, K. G.; Vasylyeva, T. I.; Suchard, M. A.; Wertheim, J. O.

2026-01-28 epidemiology
10.64898/2026.01.26.26344861 medRxiv
The emergence of variants has shaped the COVID-19 pandemic. The lack of directly observed precursors to these variants has led to proposals that variants emerge from persistent infections, from transmission in non-human animal populations after reverse zoonosis, or from cryptic transmission in the human population. We investigated the origin of variants by analyzing the molecular clock and the rates of nonsynonymous and synonymous substitutions in SARS-CoV-2 circulating in the human population, in persistently infected individuals, in non-human animals, and along variant stems: the branches preceding the emergence of SARS-CoV-2 variants (Alpha, Beta, Gamma, Delta, Epsilon, Iota, B.1.637, Mu, and Omicron: BA.1, BA.2/BA.4/BA.5). Along the variant stems we find evidence for an acceleration in the nonsynonymous substitution rate compared with the nonsynonymous substitution rate along the branches that represent the genetic diversity of circulating virus. We also find evidence for a slowdown in the synonymous substitution rate preceding the emergence of multiple named variants (e.g., Beta, Delta, Iota, Mu, Omicron BA.1); a similar pattern was observed in some individuals with persistent infections, suggesting that the viral replication rate can slow down during persistent infection. However, the synonymous rate slowdown was not observed for all variants: some (e.g., Alpha, Epsilon) exhibited an increase in the synonymous substitution rate preceding their emergence compared with typical viral transmission. The similarity between the evolutionary dynamics preceding the emergence of some variants and those during persistent infections supports the hypothesis that persistent infections were the likely source of many COVID-19 variants.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Science | 429 papers in training set | Top 0.2% | 22.7%
2. Virus Evolution | 140 papers in training set | Top 0.1% | 14.8%
3. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 4% | 12.4%
4. Molecular Biology and Evolution | 488 papers in training set | Top 0.6% | 6.9%
50% of probability mass above
5. Nature Communications | 4913 papers in training set | Top 28% | 6.4%
6. eLife | 5422 papers in training set | Top 21% | 4.2%
7. Scientific Reports | 3102 papers in training set | Top 41% | 3.1%
8. Science Advances | 1098 papers in training set | Top 11% | 2.4%
9. Current Biology | 596 papers in training set | Top 8% | 2.1%
10. Nature | 575 papers in training set | Top 10% | 1.9%
11. Cell Reports | 1338 papers in training set | Top 22% | 1.9%
12. Emerging Infectious Diseases | 103 papers in training set | Top 2% | 1.5%
13. Science Translational Medicine | 111 papers in training set | Top 5% | 0.8%
14. PLOS Biology | 408 papers in training set | Top 19% | 0.8%
15. Nature Medicine | 117 papers in training set | Top 5% | 0.8%
16. mBio | 750 papers in training set | Top 11% | 0.8%
17. PLOS Genetics | 756 papers in training set | Top 15% | 0.8%
18. Cell | 370 papers in training set | Top 17% | 0.7%
19. Proceedings of the Royal Society B: Biological Sciences | 341 papers in training set | Top 7% | 0.7%
20. Communications Biology | 886 papers in training set | Top 26% | 0.7%
21. iScience | 1063 papers in training set | Top 34% | 0.7%
22. Viruses | 318 papers in training set | Top 6% | 0.6%
23. PLOS ONE | 4510 papers in training set | Top 71% | 0.6%
24. PNAS Nexus | 147 papers in training set | Top 3% | 0.6%
25. Epidemics | 104 papers in training set | Top 2% | 0.5%
26. Nature Genetics | 240 papers in training set | Top 9% | 0.5%