Phylogenetically estimated neutral rates and fitness effects of mutations to influenza proteins

Haddox, H. K.; Hinrichs, A. S.; Jennings-Shaffer, C.; Johnson, K.; Benton, C. T.; Galloway, J. G.; Bloom, J. D.; Matsen, F. A.

2026-05-20 bioinformatics
10.64898/2026.05.18.725477 bioRxiv

Influenza virus's rapid evolution is shaped by both neutral mutation and selection. Phylogenetics can be used to study these processes, but this approach has typically only been applied to a few thousand influenza genome sequences at once. Here, we built phylogenetic trees with >100,000 influenza sequences, and then used these trees to estimate neutral rates of mutations to the virus's genome. Neutral rates varied by up to ~100-fold among the 12 nucleotide mutation types (A→C, A→G, etc.). These rates were highly correlated among influenza, SARS-CoV-2, and HIV, though more nuanced context-dependent patterns showed marked differences between influenza and SARS-CoV-2. We also estimated fitness effects of mutations by comparing the number of times a mutation was observed to occur along the branches of a tree to the number of times we expect it to have occurred under neutrality. We estimated effects for ~33,000 nonsynonymous and ~8,000 synonymous mutations spanning all influenza proteins. This compendium of estimated effects helps map the relationship between sequence and fitness in a natural setting, including regions where synonymous mutations are under functional constraint, and for proteins with limited experimentally measured effects. We built interactive heatmaps of the estimated fitness effects to help readers explore these data (see https://matsen.group/flu-mut-rates). Altogether, this work places influenza's mutation rates in a broader cross-viral context and deepens our understanding of how mutation and selection shape influenza evolution in nature at a site-specific level.
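
The comparison of observed to expected mutation counts can be illustrated with a minimal sketch. The abstract does not give the exact estimator, so the log-ratio form, the pseudocount of 0.5, the function name, and the example counts below are all assumptions chosen for illustration, not the authors' implementation.

```python
import math


def estimate_fitness_effect(observed_count, expected_count, pseudocount=0.5):
    """Return a log-ratio score for one mutation (hypothetical estimator).

    observed_count: times the mutation was seen along branches of the tree.
    expected_count: times it would be expected under neutrality.
    pseudocount:    assumed smoothing constant so that zero counts stay finite.
    """
    if observed_count < 0 or expected_count < 0:
        raise ValueError("counts must be non-negative")
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive")
    return math.log((observed_count + pseudocount) / (expected_count + pseudocount))


# Made-up counts, for illustration only.
example_counts = {
    "seen as often as expected": (10, 10),
    "never seen, but expected": (0, 20),
    "seen more than expected": (30, 10),
}

for label, (observed, expected) in example_counts.items():
    effect = estimate_fitness_effect(observed, expected)
    print(f"{label}: {effect:+.2f}")
```

Under this assumed form, a score near zero means the mutation occurs about as often as neutrality predicts, a negative score suggests it is selected against, and a positive score suggests it is favored.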

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Virus Evolution | 140 papers in training set | Top 0.1% | 18.0%
2. Molecular Biology and Evolution | 488 papers in training set | Top 0.2% | 13.9%
3. Cell Systems | 167 papers in training set | Top 1% | 9.8%
4. PLOS Computational Biology | 1633 papers in training set | Top 4% | 8.2%
5. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 9% | 7.0%
50% of probability mass above
6. Nature Communications | 4913 papers in training set | Top 34% | 4.7%
7. eLife | 5422 papers in training set | Top 26% | 3.6%
8. Cell Host & Microbe | 113 papers in training set | Top 2% | 3.0%
9. Science | 429 papers in training set | Top 11% | 2.6%
10. PLOS Genetics | 756 papers in training set | Top 7% | 2.3%
11. Nature | 575 papers in training set | Top 10% | 2.0%
12. Genome Biology | 555 papers in training set | Top 4% | 2.0%
13. PLOS Biology | 408 papers in training set | Top 8% | 1.8%
14. Cell Genomics | 162 papers in training set | Top 4% | 1.6%
15. Current Biology | 596 papers in training set | Top 11% | 1.3%
16. Molecular Systems Biology | 142 papers in training set | Top 0.9% | 1.3%
17. Cell Reports | 1338 papers in training set | Top 29% | 1.2%
18. Cell | 370 papers in training set | Top 14% | 1.2%
19. iScience | 1063 papers in training set | Top 22% | 1.2%
20. Nature Genetics | 240 papers in training set | Top 6% | 1.1%
21. Nature Biotechnology | 147 papers in training set | Top 7% | 0.8%
22. Science Advances | 1098 papers in training set | Top 29% | 0.8%
23. Nature Methods | 336 papers in training set | Top 7% | 0.7%
24. Genetics | 225 papers in training set | Top 5% | 0.6%
25. Genome Biology and Evolution | 280 papers in training set | Top 2% | 0.6%
26. Nucleic Acids Research | 1128 papers in training set | Top 20% | 0.6%