Two telomere-to-telomere, gap-free genome assemblies and comparisons revealed the conserved key genes associated with sugar accumulation in Rubus genus

Li, X.; Han, X.; Liu, S.; Zhang, Q.; Guan, J.; Tang, Y.; Zhang, M.; Lian, H.; Xu, P.; Zheng, M.; Li, K.; Sun, G.; Sun, Y.; Dong, Y.; Lin, X.; Liang, Y.; Wang, Z.; Qin, G.; Li, B.; Zhou, H.; Yang, G.; Liu, Z.; He, H.; Zhou, J.

2025-04-10 genomics
10.1101/2025.04.09.646780 bioRxiv

For the first time, we assembled two highly contiguous, completely gap-free reference genomes of the Rubus subgenus: Rubus hirsutus Thunb. Penglei (XMM) and Rubus eustephanos Focke ex Diels Dahongpao (DHP). Both are widely distributed in southern China and share similar phenotypic traits (Figures 1A, 1B, and Figure S1), yet their ripe fruits differ in sugar accumulation (Table S1), making them ideal candidates for investigating the mechanisms underlying sugar accumulation in the Rubus genus. The XMM (213.53 Mb, 28,204 genes) and DHP (218.26 Mb, 28,569 genes) genomes are closely related, having diverged approximately 3.21 Mya. Comparative genomics identified extensive synteny, interspecific structural variations (translocations, inversions, segmental duplications), and presence/absence variation (PAV). We validated the interspecific inversions using Hi-C interaction heatmaps and Sanger sequencing. We also identified a sugar transporter gene (MFS1) that is present in XMM but absent in DHP. Combined analysis of gene family expansion/contraction and transcriptome data identified two conserved key genes (RhSTP13 and RhSTP7) associated with sugar accumulation in the Rubus genus; transient expression assays showed that the two genes play distinct roles. To facilitate functional genomics studies, we also established RubusDB, a comprehensive, freely accessible database consolidating all genomic, transcriptomic, and phenotypic data of the Rubus genus. These findings provide a foundational framework for elucidating the genetic basis of sugar accumulation, genome diversification, and trait improvement in Rubus species.
Figure 1. Phenotypic characterization, de novo genome assemblies, variation, and evolution of Rubus hirsutus Thunb. Penglei (XMM) and Rubus eustephanos Focke ex Diels Dahongpao (DHP). (A) Illustration of XMM whole plants, stem, flower bud and flower, respectively. (B) Illustration of DHP whole plants, stem, flower bud and flower, respectively. (C) The chromosome karyotype analysis of XMM. Bar = 10 µm. (D) Circular diagram of XMM and DHP reference genomes. (a) Chromosomes are represented by centromeres (dark red) and telomeres (blue); (b) CEN18 density; (c) CEN17 density; (d) Gene density; (e) TE density; (f) PAV density; (g) SNP density; (h) Collinear lines at the center of diagram highlight homoeologous chromosomes relationships and non-homoeologous regions. (E) Genomic alignments between XMM and DHP. Inversions, duplications and translocations are marked with orange, blue and green ribbons, respectively. (F) Identification of large inversions in chromosome 7 between XMM and DHP. The three heatmaps show the chromatin interaction matrix, including mapping Hi-C data of DHP against DHP genome (left), mapping Hi-C data of DHP against XMM genome (middle) and mapping Hi-C data of XMM against XMM genome (right). The lower panel illustrates gene alignments between XMM and DHP. (G) Synteny, structural rearrangements, monomer, and identity distribution in the 16-20 Mbp region of chromosome 6 between XMM and DHP. (H) Local genome synteny of chromosome 5, with a structural variation that is present only in XMM. The PAV region includes a gene related to sugar transporter (MFS1, XMM05G025540). (I) Gene-level matchings of MFS genes in the PAVs region between XMM and DHP. (J) Phylogenetic relationships and divergence times between raspberries and other Rosaceae species. The black numbers close to the divergence nodes indicate the divergence times and the red bars represent the 95% confidence intervals. The red and black numbers indicate expanded and contracted ortholog groups at the corresponding node. Scale bar corresponds to 10 Mya.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Molecular Plant (36 papers in training set; Top 0.1%): 18.7%
2. Nature Communications (4913 papers in training set; Top 20%): 9.7%
3. The Plant Cell (141 papers in training set; Top 0.4%): 8.1%
4. Plant Communications (35 papers in training set; Top 0.1%): 7.9%
5. Nature Genetics (240 papers in training set; Top 1%): 6.1%

50% of probability mass above

6. Nature Plants (84 papers in training set; Top 0.3%): 6.1%
7. Horticulture Research (43 papers in training set; Top 0.4%): 4.7%
8. The Plant Journal (197 papers in training set; Top 1%): 4.7%
9. Genome Biology (555 papers in training set; Top 3%): 3.4%
10. Journal of Genetics and Genomics (36 papers in training set; Top 0.4%): 3.4%
11. Cell (370 papers in training set; Top 8%): 2.6%
12. Science (429 papers in training set; Top 11%): 2.3%
13. eLife (5422 papers in training set; Top 37%): 2.0%
14. New Phytologist (309 papers in training set; Top 3%): 2.0%
15. Nature (575 papers in training set; Top 10%): 1.8%
16. Molecular Biology and Evolution (488 papers in training set; Top 3%): 1.4%
17. Current Biology (596 papers in training set; Top 10%): 1.4%
18. Genome Research (409 papers in training set; Top 3%): 1.4%
19. Plant Biotechnology Journal (56 papers in training set; Top 0.8%): 1.3%
20. Proceedings of the National Academy of Sciences (2130 papers in training set; Top 39%): 1.1%
21. Genomics, Proteomics & Bioinformatics (171 papers in training set; Top 5%): 0.9%
22. Cell Reports (1338 papers in training set; Top 34%): 0.7%
23. Science Advances (1098 papers in training set; Top 32%): 0.7%
24. PNAS Nexus (147 papers in training set; Top 3%): 0.6%
25. Nucleic Acids Research (1128 papers in training set; Top 20%): 0.6%
26. PLOS Genetics (756 papers in training set; Top 18%): 0.6%
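
The statement that the top 5 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. The sketch below is only an illustration: the function name `journals_to_reach` and the data layout are assumptions made here, not part of the matching tool, and only the first six listed values are included.

```python
from itertools import accumulate

# Predicted probabilities (%) for the six top-ranked journals,
# copied from the list above in rank order.
ranked = [
    ("Molecular Plant", 18.7),
    ("Nature Communications", 9.7),
    ("The Plant Cell", 8.1),
    ("Plant Communications", 7.9),
    ("Nature Genetics", 6.1),
    ("Nature Plants", 6.1),
]


def journals_to_reach(entries, threshold):
    """Hypothetical helper: return (count, cumulative %) at the first rank
    where the running total reaches the threshold, or None if it never does."""
    running_totals = accumulate(prob for _, prob in entries)
    for count, total in enumerate(running_totals, start=1):
        if total >= threshold:
            return count, total
    return None


result = journals_to_reach(ranked, 50.0)
print(result)
```

With these values the running total first reaches 50% at the fifth journal, which matches the position of the "50% of probability mass above" marker.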