A T2T-CHM13 recombination map and globally diverse haplotype reference panel improves phasing and imputation

Lalli, J. L.; Bortvin, A. N.; McCoy, R. C.; Werling, D. M.

2026-05-28 genomics
10.1101/2025.02.24.639687 bioRxiv

The T2T-CHM13 complete human reference genome contains ~200 Mb of previously unresolved sequence, improving read mapping and variant calling compared to GRCh38. However, the benefits of using complete reference genomes for phasing and imputation are unclear. Here, we present a reference T2T-CHM13 recombination map and phased haplotype panel derived from 3,202 samples from the 1000 Genomes Project (1kGP). Using published long-read-based assemblies as a reference-neutral ground truth, we compared our T2T-CHM13 1kGP panel to the previously released GRCh38 1kGP panel. We found that alignment to T2T-CHM13 resulted in 38% fewer assembly-discordant SNP genotypes and 16% fewer phasing switch errors. The largest gains in panel accuracy were observed on chromosome X and in the regions flanking loci prone to disease-causing CNVs. Moreover, downsampled genomes from the Simons Genome Diversity Project attained higher imputation accuracy with the T2T-CHM13 1kGP panel than with the GRCh38 1kGP panel. Our study demonstrates that use of the T2T-CHM13 phased haplotype panel improves statistical phasing and imputation for samples from diverse human populations.
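For readers unfamiliar with the switch-error metric mentioned in the abstract, the following is a minimal sketch of how switch errors are commonly counted against a ground-truth phasing. It is an illustration under stated assumptions, not the authors' evaluation pipeline: the function name `count_switch_errors` and the tuple-based genotype encoding are hypothetical, and only sites that are heterozygous with the same alleles in both call sets are compared.

```python
from typing import List, Sequence, Tuple

# Hypothetical encoding: (allele on haplotype 1, allele on haplotype 2)
Genotype = Tuple[int, int]


def count_switch_errors(
    truth: Sequence[Genotype], inferred: Sequence[Genotype]
) -> Tuple[int, int]:
    """Return (switch_errors, opportunities) for one sample on one chromosome.

    A switch error is counted whenever the orientation of the inferred
    phasing relative to the truth changes between two consecutive
    comparable heterozygous sites.
    """
    if len(truth) != len(inferred):
        raise ValueError("truth and inferred must cover the same sites")

    # True if the inferred genotype matches the truth as written,
    # False if the two haplotypes are swapped at that site.
    orientations: List[bool] = []
    for t, q in zip(truth, inferred):
        both_het = t[0] != t[1] and q[0] != q[1]
        if both_het and set(t) == set(q):
            orientations.append(tuple(t) == tuple(q))

    switches = sum(a != b for a, b in zip(orientations, orientations[1:]))
    opportunities = max(len(orientations) - 1, 0)
    return switches, opportunities


# Toy data, invented for illustration only.
truth = [(0, 1), (0, 1), (1, 0), (0, 0), (0, 1)]
inferred = [(0, 1), (1, 0), (0, 1), (0, 0), (0, 1)]
errors, chances = count_switch_errors(truth, inferred)
print(f"{errors} switch errors over {chances} opportunities")
```

The homozygous site is skipped because it carries no phase information; a switch error rate would be the first returned value divided by the second.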

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. The American Journal of Human Genetics: 206 papers in training set; Top 0.2%; 18.5%
2. Genome Medicine: 154 papers in training set; Top 0.3%; 12.2%
3. Nature Communications: 4913 papers in training set; Top 15%; 12.2%
4. Nature Genetics: 240 papers in training set; Top 0.9%; 8.3%
50% of probability mass above
5. Genome Biology: 555 papers in training set; Top 2%; 4.8%
6. Nucleic Acids Research: 1128 papers in training set; Top 4%; 4.3%
7. Genome Research: 409 papers in training set; Top 0.8%; 3.9%
8. Scientific Reports: 3102 papers in training set; Top 40%; 3.2%
9. Cell Genomics: 162 papers in training set; Top 2%; 2.7%
10. Frontiers in Genetics: 197 papers in training set; Top 3%; 2.1%
11. Communications Biology: 886 papers in training set; Top 7%; 1.8%
12. PLOS Genetics: 756 papers in training set; Top 8%; 1.8%
13. Science: 429 papers in training set; Top 15%; 1.7%
14. NAR Genomics and Bioinformatics: 214 papers in training set; Top 2%; 1.5%
15. Bioinformatics: 1061 papers in training set; Top 8%; 1.3%
16. Nature Computational Science: 50 papers in training set; Top 1%; 0.9%
17. Bioinformatics Advances: 184 papers in training set; Top 4%; 0.9%
18. Nature Human Behaviour: 85 papers in training set; Top 4%; 0.9%
19. Genomics, Proteomics & Bioinformatics: 171 papers in training set; Top 5%; 0.9%
20. European Journal of Human Genetics: 49 papers in training set; Top 1%; 0.9%
21. Human Genetics and Genomics Advances: 70 papers in training set; Top 0.8%; 0.7%
22. BMC Genomics: 328 papers in training set; Top 6%; 0.7%
23. Briefings in Bioinformatics: 326 papers in training set; Top 7%; 0.7%
24. Nature: 575 papers in training set; Top 16%; 0.7%
25. PLOS Computational Biology: 1633 papers in training set; Top 25%; 0.7%
26. Nature Methods: 336 papers in training set; Top 6%; 0.7%
27. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 47%; 0.7%
28. PLOS ONE: 4510 papers in training set; Top 69%; 0.7%
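
The stated 50% cutoff can be checked by accumulating the listed percentages in rank order. This is a minimal sketch, assuming the final percentage on each entry is that journal's predicted probability; the function name `journals_to_reach` is hypothetical and not part of the page.

```python
from itertools import accumulate
from typing import Optional, Sequence

# Percentages copied from the ranked list, in rank order (1 to 28).
probabilities = [
    18.5, 12.2, 12.2, 8.3, 4.8, 4.3, 3.9, 3.2, 2.7, 2.1,
    1.8, 1.8, 1.7, 1.5, 1.3, 0.9, 0.9, 0.9, 0.9, 0.9,
    0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7,
]


def journals_to_reach(probs: Sequence[float], threshold: float) -> Optional[int]:
    """Return the smallest number of top-ranked entries whose summed
    probability meets the threshold, or None if it is never reached."""
    for rank, running_total in enumerate(accumulate(probs), start=1):
        if running_total >= threshold:
            return rank
    return None


print(journals_to_reach(probabilities, 50.0))  # prints 4
```

The first three entries sum to 42.9% and the first four to 51.2%, which agrees with the cutoff marker placed after rank 4.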