Pangenome-based identification of cryptic pathogenic variants in undiagnosed rare disease patients
Jang, S. S.; Kim, S.; Lee, S.; Kim, S. Y.; Moon, J.; Kim, J.; Chae, J.-H.
Background: Despite widespread implementation of exome and genome sequencing, a substantial proportion of rare disease patients remain undiagnosed because these methods are inherently limited in detecting structural, repetitive, and regulatory variants.
Methods: We applied long-read sequencing (LRS) to 40 individuals from 33 previously undiagnosed Korean families. De novo assemblies were integrated into a graph-based pangenome workflow, enabling sensitive detection of single-nucleotide, structural, and tandem-repeat variants and direct profiling of CpG methylation.
Results: Pathogenic or likely pathogenic variants were identified in 9 (27.3%) families that had remained unsolved despite prior short-read sequencing. The discoveries comprised deep intronic splice-altering SNVs, non-coding regulatory deletions, complex rearrangements, large deletions, tandem repeat expansions, and aberrant methylation profiles. We also implicated CXXC1 as a novel disease-associated gene, potentially contributing to global DNA methylation defects, and identified novel pathogenic variants in established disease genes such as HEXB and NGLY1, providing insights into underrecognized genetic contributors to rare diseases.
Conclusions: LRS coupled with pangenome-based, graph-driven analysis closed a sizable diagnostic gap, broadened the mutational spectra of several Mendelian genes, and brought epigenomic evidence into rare disease investigation. These findings support the adoption of long-read, graph-based workflows as a front-line strategy for comprehensive genomic and epigenomic diagnosis.
Matching journals
The top 2 journals account for 50% of the predicted probability mass.