
The Representativeness of Regional Influenza Virus Genomic Surveillance for National Trends in the United States

Ragonnet-Cronin, M.; Papalambros, L.; Bendall, E. E.; Fitzsimmons, W. J.; Blair, C. N.; Tibbetts, R.; Bhargava, A.; Lauring, A.

2026-03-02 infectious diseases
10.64898/2026.02.23.26346422 medRxiv

Genomic surveillance of influenza viruses informs vaccine strain selection and evolutionary forecasting. Sequencing efforts vary widely across U.S. states, which raises concerns about spatial sampling bias. We evaluated how well 10,958 influenza virus genomes sampled by our group in Michigan captured the genetic diversity in 34,743 genomes circulating nationally from the 2021/22 through 2024/25 seasons. We defined seasonal hemagglutinin haplotypes and tracked their detection across states. A small number of haplotypes dominated each season, and Michigan detected all major haplotypes, even under substantial downsampling. Detection delays were primarily driven by haplotype frequency rather than geographic factors. Comparisons across states showed that higher sequencing effort improved coverage and detection timeliness, with diminishing returns at higher volumes. Rarefaction analysis confirmed that relatively few sequences were needed to capture 95% of national haplotype diversity. These findings suggest that intensive sequencing in a single well-sampled location can be broadly representative of national influenza diversity.

One sentence summary: Dense influenza genomic sequencing from a single U.S. state captured nearly all nationally circulating haplotype diversity, with detection timeliness primarily driven by sequencing effort and haplotype frequency.
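The rarefaction idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy haplotype labels, and the definition of coverage (the share of all sequences whose haplotype appears in a random subsample) are assumptions made purely for illustration.

```python
import random
from collections import Counter


def rarefaction_curve(haplotypes, depths, n_reps=200, seed=0):
    """Mean frequency-weighted coverage at each subsampling depth.

    Coverage of one subsample = fraction of ALL sequences whose haplotype
    was seen in that subsample (an assumed definition, for illustration).
    """
    rng = random.Random(seed)
    total = len(haplotypes)
    counts = Counter(haplotypes)
    curve = {}
    for depth in depths:
        covered = 0.0
        for _ in range(n_reps):
            # Draw `depth` sequences without replacement.
            seen = set(rng.sample(haplotypes, depth))
            covered += sum(counts[h] for h in seen) / total
        curve[depth] = covered / n_reps
    return curve


def min_depth_for_coverage(curve, target=0.95):
    """Smallest evaluated depth whose mean coverage reaches `target`."""
    for depth in sorted(curve):
        if curve[depth] >= target:
            return depth
    return None


# Hypothetical toy data: a few dominant haplotypes and a rare tail.
toy = ["A"] * 600 + ["B"] * 300 + ["C"] * 80 + ["D"] * 15 + ["E"] * 5
curve = rarefaction_curve(toy, depths=[1, 5, 10, 25, 50, 100, 1000])
depth_95 = min_depth_for_coverage(curve, target=0.95)
```

Because a handful of haplotypes dominate the toy data, mean coverage rises quickly at small depths and then flattens, which is the qualitative pattern of diminishing returns the abstract describes.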

Matching journals

The top 5 journals together account for just over 50% (54.0%) of the predicted probability mass.

1. Cell, 370 papers in training set, Top 0.2%, 17.9%
2. Science, 429 papers in training set, Top 0.9%, 16.9%
3. Nature, 575 papers in training set, Top 4%, 7.9%
4. Science Translational Medicine, 111 papers in training set, Top 0.2%, 6.6%
5. Clinical Infectious Diseases, 231 papers in training set, Top 1%, 4.7%
50% of probability mass above
6. Nature Medicine, 117 papers in training set, Top 0.7%, 3.8%
7. Med, 38 papers in training set, Top 0.1%, 3.8%
8. Nature Communications, 4913 papers in training set, Top 41%, 3.5%
9. Emerging Infectious Diseases, 103 papers in training set, Top 0.9%, 2.5%
10. The Lancet Infectious Diseases, 71 papers in training set, Top 1%, 2.3%
11. Nature Genetics, 240 papers in training set, Top 4%, 1.7%
12. Nature Biotechnology, 147 papers in training set, Top 5%, 1.6%
13. Nature Microbiology, 133 papers in training set, Top 3%, 1.6%
14. Cell Systems, 167 papers in training set, Top 8%, 1.4%
15. Immunity, 58 papers in training set, Top 3%, 1.3%
16. Proceedings of the National Academy of Sciences, 2130 papers in training set, Top 37%, 1.3%
17. eLife, 5422 papers in training set, Top 48%, 1.3%
18. Cell Host & Microbe, 113 papers in training set, Top 4%, 1.1%
19. New England Journal of Medicine, 50 papers in training set, Top 0.7%, 0.9%
20. mBio, 750 papers in training set, Top 10%, 0.9%
21. Scientific Reports, 3102 papers in training set, Top 74%, 0.8%
22. eBioMedicine, 130 papers in training set, Top 4%, 0.8%
23. The Journal of Infectious Diseases, 182 papers in training set, Top 5%, 0.8%
24. Genome Medicine, 154 papers in training set, Top 8%, 0.8%
25. Virus Evolution, 140 papers in training set, Top 1%, 0.7%
26. mSphere, 281 papers in training set, Top 7%, 0.7%
27. Nature Computational Science, 50 papers in training set, Top 2%, 0.7%
28. Communications Biology, 886 papers in training set, Top 28%, 0.7%
29. PLOS Pathogens, 721 papers in training set, Top 10%, 0.7%
30. Cell Reports Medicine, 140 papers in training set, Top 9%, 0.7%