Enlarging viral mutation estimation: a view from the distribution of mutation rates

Furuyama, T. N.; de Carvalho Mello, I. M. V. G.; Janini, L. M. R.; Antoneli, F. M.

2025-03-20 microbiology
10.1101/2025.03.20.644362 bioRxiv
The empirical estimation of mutation rates is fundamental to understanding viral evolution. Viral mutation rates are estimated with varied and often complex methods, carried out through experiments essentially designed to count mutation frequencies. Mutation rates are defined as the probabilities of nucleotide substitutions and are typically reported as a single number in units of mutations (substitutions) per base (nucleotide) per replication cycle or per cell infection, depending on the replication mode of the virus. Moreover, quantifying the uncertainty of these estimates is so difficult that it is rarely reported in the literature. The values reported in the literature for the same virus fall within a broad range, sometimes spanning two orders of magnitude. For instance, mutation rates range from 10^-8 to 10^-6 mutations per base per cell infection for DNA viruses and from 10^-6 to 10^-4 mutations per base per cell infection for RNA viruses. In this paper, we propose an alternative perspective on the estimation of mutation rates, which avoids the use of consensus sequences and/or serial passages. Our approach leverages the large amount of sequencing data produced by high-throughput sequencing technologies, coupled with an experimental design that performs a single replication cycle from an initial clonal viral population. We propose to replace the single numeric mutation rate with a distribution of mutation rates (DMR), together with a procedure for estimating this distribution from sequencing data. Although the focus of this paper is the development of the approach centered on the DMR, it is straightforward to produce point and interval estimates of the mutation rates, including uncertainty quantification. In addition to estimating the DMR, we provide a theoretical characterization of it as being well approximated by a log-normal distribution. Finally, we study some non-trivial properties of the DMR related to a remarkable invariance under down-scaling the distribution from the genome to its subunits.
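
To make the idea of deriving point and interval summaries from a log-normal distribution of rates concrete, here is a minimal sketch. It is not the authors' estimation procedure: it assumes that a set of positive per-site mutation rates is already available, and it uses synthetic data drawn with hypothetical parameters (a median of 10^-5 and a spread of half an order of magnitude). The function name `lognormal_summary` is likewise an illustration only.

```python
import math
import random
import statistics


def lognormal_summary(rates, z=1.96):
    """Summarize positive rates under an assumed log-normal model.

    The fit is done by taking the mean and sample standard deviation of
    log10(rate). The returned interval is the central ~95% range of the
    fitted distribution, not a confidence interval for the median.
    """
    logs = [math.log10(r) for r in rates if r > 0]
    mu = statistics.fmean(logs)       # mean of log10 rates
    sigma = statistics.stdev(logs)    # spread in orders of magnitude
    return {
        "mu_log10": mu,
        "sigma_log10": sigma,
        "median_rate": 10 ** mu,      # point estimate (geometric mean)
        "interval": (10 ** (mu - z * sigma), 10 ** (mu + z * sigma)),
    }


# Synthetic stand-in for per-site rates (hypothetical parameters).
rng = random.Random(0)
rates = [10 ** rng.gauss(-5.0, 0.5) for _ in range(5000)]

summary = lognormal_summary(rates)
print(summary)
```

Working in log10 units keeps the spread directly readable as orders of magnitude, which matches how the reported ranges are stated in the abstract.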

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. PLOS Computational Biology | 1633 papers in training set | Top 1% | 17.2%
2. Virus Evolution | 140 papers in training set | Top 0.1% | 17.2%
3. Journal of The Royal Society Interface | 189 papers in training set | Top 0.6% | 6.3%
4. Scientific Reports | 3102 papers in training set | Top 28% | 4.2%
5. Mathematics | 11 papers in training set | Top 0.1% | 3.9%
6. Viruses | 318 papers in training set | Top 2% | 3.0%

50% of probability mass above

7. Journal of Theoretical Biology | 144 papers in training set | Top 0.5% | 2.7%
8. Bulletin of Mathematical Biology | 84 papers in training set | Top 0.8% | 2.4%
9. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 28% | 2.0%
10. Peer Community Journal | 254 papers in training set | Top 1% | 2.0%
11. iScience | 1063 papers in training set | Top 11% | 2.0%
12. PLOS ONE | 4510 papers in training set | Top 52% | 1.8%
13. Nature Communications | 4913 papers in training set | Top 54% | 1.5%
14. mSystems | 361 papers in training set | Top 5% | 1.5%
15. Mathematical Biosciences and Engineering | 23 papers in training set | Top 0.4% | 1.3%
16. Frontiers in Microbiology | 375 papers in training set | Top 6% | 1.3%
17. Cell Systems | 167 papers in training set | Top 8% | 1.3%
18. Infectious Disease Modelling | 50 papers in training set | Top 1.0% | 1.2%
19. Bioinformatics | 1061 papers in training set | Top 8% | 1.2%
20. Physical Review E | 95 papers in training set | Top 0.9% | 1.2%
21. Frontiers in Immunology | 586 papers in training set | Top 6% | 1.1%
22. Communications Biology | 886 papers in training set | Top 19% | 0.9%
23. RNA | 169 papers in training set | Top 0.5% | 0.7%
24. Cell Reports | 1338 papers in training set | Top 34% | 0.7%
25. IEEE/ACM Transactions on Computational Biology and Bioinformatics | 32 papers in training set | Top 0.7% | 0.7%
26. Nucleic Acids Research | 1128 papers in training set | Top 19% | 0.7%
27. Evolution, Medicine, and Public Health | 14 papers in training set | Top 0.4% | 0.6%
28. Journal of Infection and Public Health | 15 papers in training set | Top 1.0% | 0.6%
29. Chaos: An Interdisciplinary Journal of Nonlinear Science | 16 papers in training set | Top 0.4% | 0.6%
30. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences | 12 papers in training set | Top 0.1% | 0.6%