Benchmarking sequence performance on the DNBSEQ-T7 using Genome in a Bottle reference genomes
van Coller, A.; Taukobong, S.; Malima, M.; Ghoor, S.; Nangammbi, N.; Roode, E.; Naicker, M.; Cole, V.; Glanzmann, B.; Kinnear, C.; Carstens, N.
Advances in sequencing technologies have improved the accuracy, throughput, and completeness of human genome characterization, enabling more reliable detection of genetic variation. Well-characterized reference genomes are critical for benchmarking sequencing platforms and bioinformatics analysis pipelines. Here, we present whole genome sequencing datasets generated for the Ashkenazi Jewish trio reference samples from the Genome in a Bottle Consortium. Libraries were prepared using three distinct MGI-based workflows: PCR-free library preparation, FastFS DNA library preparation, and Universal DNA library preparation. Sequencing was performed on the MGI DNBSEQ-T7 platform, generating a minimum of 400 million paired-end reads per sample, corresponding to 30X mean genome coverage. Raw reads were processed using a standardized GATK bioinformatics workflow. Sequencing performance and variant detection accuracy were evaluated using the Genome in a Bottle high-confidence benchmark variant sets. All workflows demonstrated high sequencing quality and concordance with GIAB benchmark truth sets, with PCR-free libraries showing the strongest indel calling performance and lowest Mendelian violation rates across the Ashkenazi trio. This dataset provides a resource for benchmarking DNBSEQ-T7 sequencing and bioinformatics workflows, and for evaluating the impact of library preparation strategies on whole genome variant detection performance.
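The relationship between read count and mean coverage quoted in the abstract can be checked with the standard estimate: coverage = (number of reads × read length) / genome size. The sketch below is illustrative only. The 150 bp read length and the 3.1 Gb genome size are assumptions, since the abstract does not state either, and the function names are hypothetical. The abstract also does not say whether "400 million paired-end reads" counts read pairs or individual reads, so the read count is left as a plain input.

```python
import math

# Assumed values, NOT taken from the abstract.
ASSUMED_READ_LENGTH_BP = 150            # assumed read length per mate
ASSUMED_GENOME_SIZE_BP = 3_100_000_000  # approximate human genome size


def mean_coverage(n_reads: int,
                  read_length_bp: int = ASSUMED_READ_LENGTH_BP,
                  genome_size_bp: int = ASSUMED_GENOME_SIZE_BP) -> float:
    """Estimate mean coverage as total sequenced bases / genome size.

    n_reads counts individual reads, so one read pair contributes 2.
    """
    return n_reads * read_length_bp / genome_size_bp


def reads_for_coverage(target_coverage: float,
                       read_length_bp: int = ASSUMED_READ_LENGTH_BP,
                       genome_size_bp: int = ASSUMED_GENOME_SIZE_BP) -> int:
    """Smallest number of individual reads that reaches target_coverage."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


if __name__ == "__main__":
    pairs = 400_000_000
    # If the figure counts read pairs, each pair contributes two reads.
    print(f"400M pairs: {mean_coverage(2 * pairs):.1f}X raw coverage")
    # If the figure counts individual reads instead.
    print(f"400M reads: {mean_coverage(pairs):.1f}X raw coverage")
    print(f"Reads needed for 30X: {reads_for_coverage(30):,}")
```

This is a raw-yield estimate. Mean coverage measured after alignment is typically lower, because duplicate, unmapped, and filtered reads do not contribute to it.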
Matching journals
The top 9 journals account for 50% of the predicted probability mass.