LoFi drafts to map to: 4 haplotype-resolved Cannabis genomes enable characterization of large structural variants
Pike, B.; Goncalves da Silva, A.; Teran, W.
We present fully phased, chromosome-scale genome assemblies of 4 genotypes of Cannabis sativa. These assemblies were built from Oxford Nanopore R9.4.1 long reads, which have previously been considered insufficiently accurate for proper phasing. Contigs produced by the Phased Error Correction and Assembly Tool (PECAT), in combination with Hi-C libraries, were used by GreenHill to develop intermediate data structures that permit accurate phasing of the dual contigs, which were then scaffolded by the advanced algorithm of Yet another Hi-C Scaffolder (YaHS). These assemblies, while low in QV, are comparable to recent HiFi assemblies in contiguity and gene content, and show good macrosynteny with them. We compare these 8 haplotypes with 77 others recently produced and present a phylogenetic analysis, as well as a first draft of the Cannabis pan-NLRome.

Core
We assembled four fully phased, chromosome-scale diploid genomes of Cannabis sativa using Oxford Nanopore Technologies read sets. These new assemblies are comparable to recent PacBio HiFi assemblies in contiguity and gene content. We present a phylogenomic analysis based on whole-genome alignments that include 77 other publicly available Cannabis genomes, as well as a draft pan-NLRome.

Gene and Accession Numbers
Assemblies are archived at NCBI as BioProjects PRJNA1301983 (ANC), PRJNA1301963 (HAW), PRJNA1301984 (SRI), and PRJNA1301985 (TRC). Assemblies, annotations, and Supplemental Tables are also available on Zenodo: https://doi.org/10.5281/zenodo.16456638.
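For readers unfamiliar with the QV figure mentioned in the abstract, the following is a minimal sketch of the standard Phred-scaled definition, QV = -10 · log10(error rate). The function names and the example numbers are illustrative assumptions; they are not taken from the paper, which does not state here how its QV values were estimated.

```python
import math


def phred_qv(errors: int, total_bases: int) -> float:
    """Phred-scaled consensus quality: QV = -10 * log10(errors / total_bases).

    Hypothetical helper for illustration; not the authors' method.
    """
    if total_bases <= 0:
        raise ValueError("total_bases must be positive")
    if errors < 0 or errors > total_bases:
        raise ValueError("errors must be between 0 and total_bases")
    if errors == 0:
        # No observed errors: the quality value is unbounded.
        return math.inf
    return -10.0 * math.log10(errors / total_bases)


def error_rate_from_qv(qv: float) -> float:
    """Invert the Phred scale: per-base error rate = 10 ** (-QV / 10)."""
    return 10.0 ** (-qv / 10.0)


if __name__ == "__main__":
    # Illustrative numbers only (not results from the paper).
    print(phred_qv(errors=1, total_bases=1000))
    print(error_rate_from_qv(20))
```

On this scale, each 10-point increase in QV corresponds to a tenfold decrease in the per-base error rate, which is why a lower-QV assembly can still match a higher-QV one in contiguity and gene content.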