deCYPher: Star Allele-Resolution Computational Framework of Pharmacogenes for Haplotype-Resolved Long-Read Assemblies
Chang, T.-Y.; Liu, Y.-S.; Lai, H.-S.; Hung, T.-K.; Lin, H.-F.; Lin, Y.-H.; Hsu, C.-L.; Yang, Y.-C.; Chen, C.-Y.; Chen, P.-L.; Hsu, J. S.
Pharmacogenomics plays a crucial role in precision medicine, yet accurate pharmacogene interpretation remains difficult. Existing next-generation sequencing (NGS) tools for allele typing, such as Aldy and Cyrius, cannot achieve complete accuracy because of genomic complexities including pseudogenes, structural variations, hybrid genes, copy number variations, and gene deletions. We developed deCYPher, a tool that generates personalized pharmacogenomic reports from haplotype-resolved assemblies. The tool supports analysis of all PharmVar 1A level genes, including CYP2B6, CYP2C9, CYP2C19, CYP2D6, CYP3A5, CYP4F2, DPYD, NUDT15, and SLCO1B1. Applied to all HPRC haplotypes (both release 1 and release 2 data), deCYPher demonstrated high accuracy in resolving complex gene structures. For CYP2D6, the release 1 data showed 6% gene multiplications, 6% full gene deletions, and 4% CYP2D6/CYP2D7 hybrids. By contrast, release 2 showed a higher prevalence of multiplications (14%) and hybrids (11%), while the frequency of full gene deletions remained comparable at 5%. Comparison with pb-StarPhase revealed discrepancies in 12 of 94 assemblies in the release 1 dataset. For instance, in sample HG02257, Aldy, Cyrius, and deCYPher consistently identified the genotype as *2/*35, whereas pb-StarPhase reported *2/*2. Notably, the *35-defining variants were present in the BAM and VCF files of the pb-StarPhase pipeline, but the local read depth over the *35-specific region was only 5x in HG02257-p, suggesting that the misclassification resulted from insufficient coverage, a known limitation of pb-StarPhase under low-depth conditions.
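The cross-tool comparison described above can be illustrated with a minimal sketch. It is not deCYPher's code or any tool's API: the function names are hypothetical, and it assumes that diplotype calls are plain strings such as `*2/*35` and that allele order does not matter. The only data used are the HG02257 calls and the 12-of-94 count reported in the abstract.

```python
from collections import Counter

def normalize_diplotype(call: str) -> tuple[str, ...]:
    """Split a call such as '*35/*2' into an order-independent tuple of alleles."""
    return tuple(sorted(allele.strip() for allele in call.split("/")))

def majority_call(calls: dict[str, str]) -> tuple[str, ...]:
    """Return the normalized diplotype reported by the most tools."""
    counts = Counter(normalize_diplotype(c) for c in calls.values())
    return counts.most_common(1)[0][0]

def discordant_tools(calls: dict[str, str]) -> list[str]:
    """List the tools whose call differs from the majority call."""
    consensus = majority_call(calls)
    return sorted(
        tool for tool, call in calls.items()
        if normalize_diplotype(call) != consensus
    )

# CYP2D6 calls for HG02257 as reported in the abstract.
hg02257 = {
    "Aldy": "*2/*35",
    "Cyrius": "*2/*35",
    "deCYPher": "*2/*35",
    "pb-StarPhase": "*2/*2",
}

print(discordant_tools(hg02257))  # ['pb-StarPhase']

# Discrepancy rate against pb-StarPhase in release 1 (12 of 94 assemblies).
print(f"{12 / 94:.1%}")  # 12.8%
```

A majority vote is used here only to flag the outlier. It does not establish which call is correct, which in the abstract rests on inspecting the *35-defining variants and the local read depth.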