HiFi sequencing accurately identifies clinically relevant variants in paralogous genes
van der Sanden, B.; Betz, C.; Herzog, K.; Schamschula, E.; Wimmer, K.; Vater, I.; Balachandran, S.; Chen, X.; Corominas Galbany, J.; Timmermans, R.; Derks, R.; HiFi Solves EMEA Consortium; Spielmann, M.; Eberle, M. A.; Gilissen, C.; Vissers, L. E. L. M.; Zschocke, J.; Bolz, H. J.; Hoischen, A.
Short-read sequencing (SRS) methods have improved the detection of small genetic variants but remain limited in highly homologous genomic regions, such as segmental duplications with gene-pseudogene pairs. These paralogous regions often require complex, locus-specific assays for accurate analysis. Long-read genome sequencing (lrGS) technologies, such as PacBio HiFi sequencing, can span these regions but still face challenges in variant calling due to alignment ambiguities. Here, we evaluated PacBio HiFi lrGS combined with Paraphase, a dedicated haplotype-based variant caller, in 86 individuals with 125 known clinically relevant variants across 11 paralogous loci. Standard HiFi variant callers detected 95/125 variants, while the remaining 30 variants were only identified by Paraphase. Together, the standard variant callers and Paraphase detected all known variants, including SNVs, InDels, CNVs, SVs, and gene conversions. In addition, lrGS allowed accurate phasing and gene-pseudogene copy number detection. We demonstrate that PacBio HiFi lrGS, particularly when integrated with Paraphase, enables comprehensive variant detection in previously difficult-to-assess genomic regions. These results also suggest that lrGS is ready for wider implementation, possibly as a first-tier diagnostic approach for individuals with suspected variants in these paralogous regions.