The haplotype-resolved chromosome pairs and transcriptome of a heterozygous diploid African cassava cultivar

Qi, W.; Lim, Y.-W.; Patrignani, A.; Schlaepfer, P.; Bratus-Neuenschwander, A.; Grueter, S.; Chanez, C.; Rodde, N.; Prat, E.; Vautrin, S.; Fustier, M.-A.; Pratas, D.; Schlapbach, R.; Gruissem, W.

2021-11-19 genomics
10.1101/2021.11.16.468774 bioRxiv
Background: Cassava (Manihot esculenta) is an important clonally propagated food crop in tropical and sub-tropical regions worldwide. Genetic gain by molecular breeding is limited because cassava has a highly heterozygous, repetitive and difficult-to-assemble genome.

Findings: Here we demonstrate that Pacific Biosciences high-fidelity (HiFi) sequencing reads, in combination with the assembler hifiasm, produced genome assemblies at near-complete haplotype resolution with higher continuity and accuracy compared to conventional long sequencing reads. We present two chromosome-scale haploid genomes phased with Hi-C technology for the diploid African cassava variety TME204. Genome comparisons revealed extensive chromosome re-arrangements and abundant intra-genomic and inter-genomic divergent sequences despite high gene synteny, with most large structural variations being LTR-retrotransposon related. Allele-specific expression analysis of different tissues based on the haplotype-resolved transcriptome identified both stable and inconsistent alleles with imbalanced expression patterns, while most alleles were expressed coordinately. Among tissue-specific differentially expressed transcripts, coordinately and biasedly regulated transcripts were functionally enriched for different biological processes. We use the reference-quality assemblies to build a cassava pan-genome and demonstrate its importance in representing the genetic diversity of cassava for downstream reference-guided omics analysis and breeding.

Conclusions: The haplotype-resolved genome allows the first systematic view of the heterozygous diploid genome organization in cassava. The completely phased and annotated chromosome pairs will be a valuable resource for cassava breeding and research. Our study may also provide insights into developing cost-effective and efficient strategies for resolving complex genomes with high resolution, accuracy and continuity.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. The Plant Genome: 53 papers in training set, Top 0.1%, 22.6%
2. Plant Biotechnology Journal: 56 papers in training set, Top 0.1%, 12.6%
3. Frontiers in Plant Science: 240 papers in training set, Top 0.8%, 10.5%
4. The Plant Journal: 197 papers in training set, Top 0.6%, 7.2%
(50% of probability mass above)
5. DNA Research: 23 papers in training set, Top 0.1%, 6.8%
6. Gigabyte: 60 papers in training set, Top 0.2%, 4.9%
7. Horticulture Research: 43 papers in training set, Top 0.5%, 3.6%
8. Frontiers in Genetics: 197 papers in training set, Top 2%, 3.6%
9. Scientific Reports: 3102 papers in training set, Top 50%, 2.1%
10. PLOS ONE: 4510 papers in training set, Top 53%, 1.7%
11. Plant Direct: 81 papers in training set, Top 1%, 1.7%
12. Plant Communications: 35 papers in training set, Top 0.9%, 1.5%
13. BMC Genomics: 328 papers in training set, Top 3%, 1.5%
14. G3 Genes|Genomes|Genetics: 351 papers in training set, Top 2%, 1.3%
15. New Phytologist: 309 papers in training set, Top 4%, 1.3%
16. G3: Genes, Genomes, Genetics: 222 papers in training set, Top 0.6%, 1.1%
17. Genes: 126 papers in training set, Top 2%, 1.0%
18. GigaScience: 172 papers in training set, Top 2%, 0.9%
19. Microbial Genomics: 204 papers in training set, Top 2%, 0.8%
20. Molecular Ecology Resources: 161 papers in training set, Top 1%, 0.7%
21. Genome Biology: 555 papers in training set, Top 7%, 0.7%
22. G3: 33 papers in training set, Top 0.6%, 0.6%
23. Genomics: 60 papers in training set, Top 3%, 0.6%
24. Computational and Structural Biotechnology Journal: 216 papers in training set, Top 12%, 0.5%
25. Bioinformatics: 1061 papers in training set, Top 11%, 0.5%
26. Plant Methods: 39 papers in training set, Top 1.0%, 0.5%
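
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked with a running total over the listed percentages. The sketch below is only an illustration: the function name `journals_to_reach` is hypothetical, and it assumes the cutoff is the first rank at which the cumulative probability reaches or exceeds 50%, which the page does not state explicitly.

```python
from itertools import accumulate

# Predicted probabilities (%) for the six highest-ranked journals,
# copied from the ranking above.
probabilities = [22.6, 12.6, 10.5, 7.2, 6.8, 4.9]


def journals_to_reach(probs, threshold):
    """Count how many top-ranked entries are needed for the running
    total to reach the threshold (assumed cutoff rule, see lead-in)."""
    for count, running_total in enumerate(accumulate(probs), start=1):
        if running_total >= threshold:
            return count
    return None  # threshold not reached with the entries given


print(journals_to_reach(probabilities, 50.0))
```

The first three entries sum to 45.7%, and adding the fourth brings the total to 52.9%, which is consistent with the cutoff being placed after rank 4.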