Aviti Sequencing and Marker Gene Data Analysis

Gould, T. J.; Taylor, M.; Santelli, C.

2026-02-09 bioinformatics

10.64898/2026.02.06.704475 bioRxiv


Accurate identification of microbial species in complex populations and communities relies on isolating representative 16S, ITS, and 18S marker sequences through DNA extraction, PCR, and sequencing. Aviti sequencing has improved the read quality and depth of marker gene sequencing. Quality scores exceeding Q40, which represent highly accurate sequencing, allow researchers to ask more questions of their marker gene data. However, this improvement in quality and throughput also brings a surprising increase in the diversity of amplicon sequence variants (ASVs), making further analysis and comparison with previous studies on Illumina platforms challenging. This increased diversity causes downstream processing issues, including an over-reporting of chimeric ASVs. Here we identify this problem and put forward straightforward solutions that retain counts, reduce technically introduced diversity, and tie chimeric read identification to minimum parent distance. Using synthetic mock samples, we found that erroneous ASVs systematically arise from substitution errors introduced by the upstream PCR methods. This error can be reduced significantly by bioinformatically clustering ASVs within 99% similarity. Further, we highlight technically introduced variation that results from variable region length, sample misassignment, and sample biomass. Collectively, these results improve the similarity of Aviti and Illumina datasets, allowing better comparisons of microbial studies across platforms.
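
To illustrate the idea of clustering ASVs within 99% similarity while retaining counts, the following is a minimal sketch and not the authors' pipeline. It assumes ASVs are held as a mapping from sequence to read count, that erroneous variants differ from their parent by substitutions only (so only equal-length sequences are compared), and that a greedy, abundance-ordered strategy is acceptable. The function names `identity` and `cluster_asvs` are hypothetical.

```python
from typing import Dict


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two sequences.

    Assumption: only substitution errors are considered, so sequences of
    different length are treated as unrelated (identity 0.0).
    """
    if len(a) != len(b) or not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def cluster_asvs(counts: Dict[str, int], threshold: float = 0.99) -> Dict[str, int]:
    """Greedy abundance-ordered clustering of ASVs.

    ASVs are visited from most to least abundant. Each ASV is merged into
    the first existing centroid it matches at or above the threshold, and
    its reads are added to that centroid, so total counts are retained.
    Otherwise the ASV becomes a new centroid.
    """
    clustered: Dict[str, int] = {}
    # Sort by descending count, then by sequence for a deterministic order.
    for seq, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        for centroid in list(clustered):
            if identity(seq, centroid) >= threshold:
                clustered[centroid] += n
                break
        else:
            clustered[seq] = n
    return clustered


if __name__ == "__main__":
    parent = "ACGT" * 50                 # 200 nt toy sequence
    one_sub = "C" + parent[1:]           # 1 substitution: 99.5% identity
    three_subs = "TTT" + parent[3:]      # 3 substitutions: 98.5% identity
    table = {parent: 1000, one_sub: 10, three_subs: 5}
    merged = cluster_asvs(table)
    print(len(merged), sum(merged.values()))
```

In this toy input the single-substitution variant is folded into its abundant parent, while the variant with three substitutions falls below the threshold and stays separate. Dedicated clustering tools use full alignments and also handle insertions and deletions, which this sketch deliberately ignores.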
