Benchmarking full-length ITS metabarcoding across Illumina 2x500, PacBio, and Oxford Nanopore sequencing using mock and soil communities

Tedersoo, L.; Prous, M.; Chen, M.; Anslan, S.; Saar, I.; Dubois, B.; Mikryukov, V.

2026-05-21 bioinformatics
10.64898/2026.05.20.726443 bioRxiv

Metabarcoding is a powerful tool for biodiversity comparisons, where standard-size DNA barcodes (>500 bases) offer better taxonomic resolution than shorter ones. Still, the choice of sequencing platform and bioinformatics pipeline may strongly affect inferred diversity due to various technical biases. We assessed the relative performance of Illumina MiSeq i100 (2x500 paired-end), PacBio Revio and Oxford Nanopore MinION sequencing, together with their bioinformatics pipelines, using full-length ITS amplicon sequencing datasets from a 103-species mock community and 45 composite soil samples. Despite numerous low-quality reads, PacBio yielded the lowest overall error rate and the highest number of taxa. Illumina revealed the highest proportion of chimeric and index-switched reads, along with a strong bias towards shorter amplicons. MinION data analysed using PRONAME and Minovar (a bioinformatics pipeline presented here) had the largest proportion of low-quality data, and rare taxa were lost during the data filtering and read polishing steps. Although Minovar enabled amplicon sequence variant (ASV) level precision for common taxa, we recommend clustering ASVs into OTUs. For PacBio, standard filtering approaches outperformed the ASV approach because they retained rare taxa. For Illumina, a stringent ASV approach or removal of rare OTUs would limit artefacts. Across all platforms, excess PCR cycles promoted chimeric and low-quality reads and reduced the quantitative accuracy of biodiversity assessments. With moderate differences in effect sizes, all analytical approaches supported the conclusion that sampling design determines how we see soil biodiversity responses to land use. For biodiversity surveys based on full-length ITS metabarcoding, we recommend using PacBio sequencing with standard, non-ASV pipelines.
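
The abstract mentions removal of rare OTUs as one way to limit artefacts in Illumina data. As a rough illustration only, the sketch below drops OTUs whose total read count across samples falls under a cutoff. The function name `drop_rare_otus`, the cutoff `min_total_reads`, and the toy counts are all assumptions made for this example; they are not taken from the preprint and do not represent the authors' pipeline or thresholds.

```python
from typing import Dict

# OTU table as nested dicts: OTU id -> sample id -> read count.
OtuTable = Dict[str, Dict[str, int]]


def drop_rare_otus(otu_counts: OtuTable, min_total_reads: int = 2) -> OtuTable:
    """Keep only OTUs whose reads, summed over all samples, reach the cutoff.

    `min_total_reads` is a hypothetical parameter; a suitable value depends
    on sequencing depth and study design.
    """
    kept: OtuTable = {}
    for otu_id, per_sample in otu_counts.items():
        if sum(per_sample.values()) >= min_total_reads:
            kept[otu_id] = dict(per_sample)  # copy so the input is not mutated
    return kept


# Toy, made-up counts for demonstration.
toy = {
    "OTU_1": {"soil_A": 120, "soil_B": 95},
    "OTU_2": {"soil_A": 1, "soil_B": 0},  # singleton, removed at cutoff 2
    "OTU_3": {"soil_A": 0, "soil_B": 7},
}
filtered = drop_rare_otus(toy, min_total_reads=2)
print(sorted(filtered))
```

A stricter cutoff removes more potential artefacts but, as the abstract notes for other platforms, it can also discard genuinely rare taxa, so the choice is a trade-off rather than a fixed rule.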

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

1 | PLOS ONE | 4510 papers in training set | Top 20% | 9.1%
2 | Scientific Reports | 3102 papers in training set | Top 10% | 8.4%
3 | Frontiers in Microbiology | 375 papers in training set | Top 1% | 6.3%
4 | PeerJ | 261 papers in training set | Top 0.7% | 6.3%
5 | Environmental Microbiome | 26 papers in training set | Top 0.1% | 6.3%
6 | Ecological Indicators | 20 papers in training set | Top 0.1% | 4.1%
7 | Nature Communications | 4913 papers in training set | Top 39% | 3.6%
8 | Peer Community Journal | 254 papers in training set | Top 0.8% | 3.6%
9 | mSystems | 361 papers in training set | Top 3% | 3.6%
50% of probability mass above
10 | Methods in Ecology and Evolution | 160 papers in training set | Top 0.9% | 3.6%
11 | Molecular Ecology Resources | 161 papers in training set | Top 0.4% | 3.2%
12 | Environmental DNA | 49 papers in training set | Top 0.1% | 3.1%
13 | Metabarcoding and Metagenomics | 12 papers in training set | Top 0.1% | 2.1%
14 | Frontiers in Plant Science | 240 papers in training set | Top 3% | 1.9%
15 | Microbiome | 139 papers in training set | Top 2% | 1.7%
16 | FEMS Microbiology Ecology | 47 papers in training set | Top 0.2% | 1.7%
17 | Science of The Total Environment | 179 papers in training set | Top 3% | 1.5%
18 | eLife | 5422 papers in training set | Top 45% | 1.5%
19 | Ecology and Evolution | 232 papers in training set | Top 3% | 1.3%
20 | New Phytologist | 309 papers in training set | Top 4% | 0.9%
21 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 7% | 0.9%
22 | Agriculture, Ecosystems & Environment | 15 papers in training set | Top 0.2% | 0.9%
23 | Microbiology Spectrum | 435 papers in training set | Top 4% | 0.9%
24 | Microorganisms | 101 papers in training set | Top 2% | 0.9%
25 | Genome Biology | 555 papers in training set | Top 7% | 0.8%
26 | GigaScience | 172 papers in training set | Top 3% | 0.7%
27 | PLOS Computational Biology | 1633 papers in training set | Top 26% | 0.7%
28 | Journal of Environmental Management | 11 papers in training set | Top 0.9% | 0.7%
29 | Molecular Ecology | 304 papers in training set | Top 4% | 0.7%
30 | Environmental Microbiology | 119 papers in training set | Top 3% | 0.7%
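
The statement that the top 9 journals account for 50% of the predicted probability mass can be checked from the listed percentages. The short sketch below accumulates the probabilities in rank order and reports the first rank at which the running total reaches a target. The values are the rounded figures shown in the listing, so the totals are approximate; the helper name `journals_needed` is made up for this example.

```python
from itertools import accumulate
from typing import Optional, Sequence

# Predicted probabilities (in percent) for ranks 1-30, copied from the listing.
# These are rounded display values, so sums are approximate.
probs_percent = [
    9.1, 8.4, 6.3, 6.3, 6.3, 4.1, 3.6, 3.6, 3.6, 3.6,
    3.2, 3.1, 2.1, 1.9, 1.7, 1.7, 1.5, 1.5, 1.3, 0.9,
    0.9, 0.9, 0.9, 0.9, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7,
]


def journals_needed(probs: Sequence[float], target: float) -> Optional[int]:
    """Return the smallest rank whose cumulative probability reaches `target`.

    Returns None if the listed probabilities never reach the target.
    """
    for rank, cumulative in enumerate(accumulate(probs), start=1):
        if cumulative >= target:
            return rank
    return None


rank_for_half = journals_needed(probs_percent, 50.0)
print(rank_for_half)
print(round(sum(probs_percent), 1))
```

With the rounded values, the running total is about 47.7% after rank 8 and about 51.3% after rank 9, which agrees with the "top 9" statement. The 30 listed journals together cover roughly 81.7%, so the remaining mass presumably belongs to journals not shown.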