The diploid reference genome of a human embryonic stem cell line

Pacar, I.; Ungaro, M. T.; Chen, Y.; Dallali, H.; Medico, J. A.; Hebbar, P.; Diekhaus, M.; Di Tommaso, E.; Geleta, M.; Chan, P. P.; Lowe, T. M.; Balacco, J.; Jain, N.; Ackerman, F.; Mochi, M.; Ioannidis, A. G.; Sawarkar, N.; Diaz, K.; Krishna Sudhakar, K.; Powell, J. E.; Jain, M.; Rosa, A.; Croft, G. F.; Tanzer, A.; Jarvis, E. D.; Formenti, G.; Salama, S. R.; Giunta, S.

2026-03-30 genomics
10.64898/2026.03.26.714432 bioRxiv
Advances in DNA sequencing and assembly technologies are spurring a shift from haploid reference genomes to sample-specific diploid assemblies. Here, we generated the first telomere-to-telomere (T2T) diploid reference for the widely used human embryonic stem cell (hESC) line, H9 (WAe009-A). This haplotype-resolved assembly is highly accurate with comprehensive annotation of genes, segmental duplications, methylation, and chromatin conformation. Pangenomic and phased-locus inference point to H9's mixed ancestry with a predominant European component. H9-specific genomic features include near-perfect telomeres ~1.65-fold longer than other T2T assemblies, consistent with telomerase activity during pluripotency; chromosome 17 inversions that can predispose offspring to neurological syndromes; and expansions of ncRNA clusters, with overall genomic stability maintained despite extensive culturing. Mapping multi-omic datasets to the genome, we demonstrate the power of this resource for allele-specific, high-precision transcriptomic, genetic, and epigenetic analyses, with far-reaching implications for human development and disease.

Matching journals

The top 3 journals account for 50% of the predicted probability mass.