A novel long-amplicon rpoB primer pair for high resolution microbiome analysis at the species-level

Venbrux, M.; Crauwels, S.; Rediers, H.

2026-05-17 molecular biology
10.64898/2026.05.15.725465 bioRxiv
The 16S rRNA gene is the most widely used genetic marker for microbial community profiling, but its limited sequence divergence often prevents species-level identification. The RNA polymerase β-subunit gene (rpoB) offers higher sequence variability, single-copy occurrence, and stronger phylogenetic consistency, yet its adoption in metataxonomic studies has been constrained by the lack of universal primer sets. Here, we present a novel universal primer pair that amplifies an ~1,800 bp rpoB region (rpoB_MV) compatible with long-read sequencing platforms. In silico evaluation across 17,683 bacterial reference genomes demonstrated high universality, with over 86% of genomes predicted to amplify. Compared with full-length and partial 16S rRNA gene markers, the rpoB_MV amplicon exhibited significantly greater inter-species sequence divergence and improved phylogenetic concordance with core-genome trees. Sequencing of two complementary mock communities confirmed superior species-level identification accuracy, with misclassification rates below 0.01% and no reads assigned to unresolved species clusters. These results establish rpoB_MV as a robust alternative to 16S rRNA gene-based profiling for high-resolution metataxonomic applications.

IMPORTANCE: Microbial community studies increasingly require species-level resolution because species within the same genus can differ substantially in pathogenicity, ecological function, and metabolic capacity. Current 16S rRNA gene-based methods frequently fail to distinguish closely related species, collapsing biologically distinct organisms into the same taxonomic assignment and obscuring community differences that matter for clinical diagnostics, food safety, and environmental monitoring. The rpoB_MV primer pair presented here overcomes this limitation by targeting a longer, more variable region of the rpoB gene, enabling accurate species-level identification across diverse bacterial phyla. Combined with advances in long-read sequencing, this approach provides researchers with a practical tool to resolve microbial communities at the species-level.

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Nature Communications: 4913 papers in training set, Top 22%, 8.4%
2. Journal of Clinical Microbiology: 120 papers in training set, Top 0.3%, 8.4%
3. Microbial Genomics: 204 papers in training set, Top 0.2%, 8.4%
4. mBio: 750 papers in training set, Top 3%, 6.4%
5. mSphere: 281 papers in training set, Top 0.8%, 4.9%
6. ISME Communications: 103 papers in training set, Top 0.3%, 4.9%
7. Scientific Reports: 3102 papers in training set, Top 23%, 4.9%
8. mSystems: 361 papers in training set, Top 2%, 4.3%
50% of probability mass above
9. Applied and Environmental Microbiology: 301 papers in training set, Top 0.6%, 4.3%
10. Nucleic Acids Research: 1128 papers in training set, Top 5%, 4.0%
11. PLOS ONE: 4510 papers in training set, Top 39%, 3.6%
12. Frontiers in Microbiology: 375 papers in training set, Top 3%, 3.6%
13. Microbiology Spectrum: 435 papers in training set, Top 2%, 2.1%
14. Water Research: 74 papers in training set, Top 0.8%, 1.8%
15. Nature Microbiology: 133 papers in training set, Top 3%, 1.5%
16. Environmental Microbiome: 26 papers in training set, Top 0.3%, 1.3%
17. Genome Biology: 555 papers in training set, Top 6%, 1.2%
18. Environmental Microbiology: 119 papers in training set, Top 2%, 1.1%
19. Briefings in Bioinformatics: 326 papers in training set, Top 5%, 1.0%
20. BMC Bioinformatics: 383 papers in training set, Top 7%, 0.7%
21. eLife: 5422 papers in training set, Top 58%, 0.7%
22. BMC Microbiology: 35 papers in training set, Top 1%, 0.7%
23. Molecular Ecology Resources: 161 papers in training set, Top 1%, 0.7%
24. Gut Microbes: 70 papers in training set, Top 1%, 0.7%
25. Communications Biology: 886 papers in training set, Top 26%, 0.7%
26. Proceedings of the National Academy of Sciences: 2130 papers in training set, Top 46%, 0.7%
27. ACS Synthetic Biology: 256 papers in training set, Top 3%, 0.7%
28. Clinical Chemistry: 22 papers in training set, Top 1%, 0.6%
29. BMC Genomics: 328 papers in training set, Top 7%, 0.6%
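
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed percentages. The sketch below is only an illustration: the function name `journals_to_reach` is hypothetical, and the values are the rounded percentages shown in the list above, so the underlying unrounded probabilities may differ slightly.

```python
# Predicted probabilities (percent) for the ten top-ranked journals,
# copied from the list above. These are rounded display values.
probabilities = [8.4, 8.4, 8.4, 6.4, 4.9, 4.9, 4.9, 4.3, 4.3, 4.0]


def journals_to_reach(probs, threshold):
    """Return (count, cumulative_sum) for the shortest prefix of `probs`
    whose sum reaches `threshold`; count is None if it is never reached."""
    total = 0.0
    for count, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return count, total
    return None, total


count, mass = journals_to_reach(probabilities, 50.0)
print(count, round(mass, 1))  # prints "8 50.6"
```

With these rounded values, the first seven journals sum to 46.3% and the eighth brings the total to 50.6%, which agrees with the marker placed after rank 8.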