A T2T-CHM13 recombination map and globally diverse haplotype reference panel improves phasing and imputation
Lalli, J. L.; Bortvin, A. N.; McCoy, R. C.; Werling, D. M.
The T2T-CHM13 complete human reference genome contains ~200 Mb of previously unresolved sequence, improving read mapping and variant calling compared to GRCh38. However, the benefits of using complete reference genomes for phasing and imputation are unclear. Here, we present a reference T2T-CHM13 recombination map and phased haplotype panel derived from 3,202 samples from the 1000 Genomes Project (1kGP). Using published long-read-based assemblies as a reference-neutral ground truth, we compared our T2T-CHM13 1kGP panel to the previously released GRCh38 1kGP panel. We found that alignment to T2T-CHM13 resulted in 38% fewer assembly-discordant SNP genotypes and 16% fewer phasing switch errors. The largest gains in panel accuracy were observed on chromosome X and in the regions flanking loci prone to disease-causing CNVs. Moreover, downsampled genomes from the Simons Genome Diversity Project attained higher imputation accuracy when using the T2T-CHM13 versus GRCh38 1kGP panel. Our study demonstrates that use of the T2T-CHM13 phased haplotype panel improves statistical phasing and imputation for samples from diverse human populations.
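For readers unfamiliar with the switch-error metric mentioned in the abstract, the following is a minimal sketch of one common definition: walk along the heterozygous sites of a sample and count how often the test phasing flips orientation relative to the truth between consecutive sites. This is an illustration only, not the authors' evaluation pipeline; the function name `count_switch_errors` and the toy genotypes are hypothetical, and real comparisons operate on phased VCF records.

```python
from typing import List, Tuple

# A phased genotype: (allele on haplotype 1, allele on haplotype 2).
Genotype = Tuple[int, int]


def count_switch_errors(
    truth: List[Genotype], test: List[Genotype]
) -> Tuple[int, int]:
    """Return (switch_errors, opportunities) for one sample.

    Hypothetical helper: both lists must describe the same sites in the
    same order. Sites that are homozygous in the truth set, or whose
    test genotype is discordant with the truth, are skipped.
    """
    orientations: List[bool] = []
    for t, q in zip(truth, test):
        if t[0] == t[1]:
            continue  # homozygous in truth: carries no phase information
        if q == t:
            orientations.append(True)   # same orientation as truth
        elif q == (t[1], t[0]):
            orientations.append(False)  # flipped relative to truth
        # any other genotype is discordant and is ignored here

    # A switch error is a change of orientation between consecutive
    # informative heterozygous sites.
    switches = sum(a != b for a, b in zip(orientations, orientations[1:]))
    opportunities = max(len(orientations) - 1, 0)
    return switches, opportunities


if __name__ == "__main__":
    # Toy data (not from the study): the test phasing flips after site 2.
    truth = [(0, 1), (1, 0), (0, 0), (0, 1), (1, 0)]
    test = [(0, 1), (1, 0), (0, 0), (1, 0), (0, 1)]
    errors, chances = count_switch_errors(truth, test)
    print(f"{errors} switch error(s) over {chances} opportunities")
```

Dividing the error count by the number of opportunities gives a switch error rate. A relative statement such as "16% fewer" compares counts or rates between two panels evaluated against the same truth set.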