Haplotype-resolved genome of autotetraploid alfalfa (Medicago sativa) Regen-SY27x uncovers large scale structural variation and resistance gene dynamics

Kaur, H.; Cameron, C. T.; Gomez, A.; Mudge, J.; Farmer, A.; Shannon, L. M.; Samac, D. A.

2026-05-05 genomics
10.64898/2026.05.01.722254 bioRxiv
Polyploid genome assembly presents unique challenges due to extensive heterozygosity and complex haplotype structure. We report a haplotype-resolved, chromosome-scale assembly of Regen-SY27x, a genotype of autotetraploid alfalfa (Medicago sativa) that is widely used for genetic modification because of its excellent regenerative capacity in tissue culture. Using PacBio HiFi long reads, Omni-C scaffolding, and linkage map-guided phasing, we generated a 3.2 Gb assembly comprising four haplotypes with high contiguity and completeness. K-mer-based validation confirmed accurate haplotype separation, while linkage map integration and dotplot analysis identified and corrected chimeric scaffolds. Gene annotation yielded 221,688 protein-coding genes, with more than 99% assigned to pseudochromosomes. Repetitive elements accounted for 62.7% of the genome, dominated by long terminal repeat retrotransposons and a high fraction of Helitrons. The spatial enrichment of Helitrons within gene-dense distal chromosome arms underscores their role as key drivers of genomic innovation and gene family expansion. We identified 3,696 nucleotide-binding leucine-rich repeat R genes, with Toll/interleukin-1 receptor-like and Rx-type subclasses forming large tandem clusters across haplotypes. Comparative analyses revealed strong macrosyntenic conservation among Regen-SY27x and the publicly available Chinese alfalfa genomes, but extensive structural variation both within Regen-SY27x haplotypes and between Regen-SY27x and the Chinese genotypes, with tens of thousands of duplications, inversions, and translocations detected. These results demonstrate that a single autotetraploid individual captures extensive structural diversity, but that individuals from different populations vary greatly. The Regen-SY27x assembly provides a foundational genomic resource for investigating polyploid genome evolution and identifying genetic variation relevant to biological and agronomic improvement in alfalfa.
Article Summary

This study presents the first chromosome-scale, haplotype-resolved genome assembly of the US alfalfa germplasm Regen-SY27x, a key alfalfa genotype used widely for genetic engineering. We integrated HiFi long reads, Omni-C™ scaffolding, and linkage map-guided phasing to reconstruct all four haplotypes of this complex autotetraploid. Our results identified 221,688 protein-coding genes and revealed immense intra-individual structural variation dominated by small duplications. This high-quality reference serves as a foundational tool for the alfalfa community, enabling researchers to link complex structural diversity with agronomic traits and further enhance the biotechnological potential of this essential forage crop.

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Plant Biotechnology Journal (56 papers in training set; Top 0.1%; 17.8%)
2. Plant Communications (35 papers in training set; Top 0.1%; 9.7%)
3. The Plant Journal (197 papers in training set; Top 0.6%; 7.9%)
4. Molecular Plant (36 papers in training set; Top 0.2%; 6.5%)
5. Nature Communications (4913 papers in training set; Top 28%; 6.5%)
6. Horticulture Research (43 papers in training set; Top 0.4%; 6.1%)

50% of probability mass above

7. Genome Biology (555 papers in training set; Top 2%; 4.1%)
8. The Plant Cell (141 papers in training set; Top 0.8%; 4.1%)
9. Cell Genomics (162 papers in training set; Top 2%; 3.4%)
10. The Plant Genome (53 papers in training set; Top 0.3%; 2.6%)
11. Nature Genetics (240 papers in training set; Top 3%; 2.3%)
12. Genomics, Proteomics & Bioinformatics (171 papers in training set; Top 3%; 2.0%)
13. Frontiers in Plant Science (240 papers in training set; Top 3%; 1.8%)
14. Nature Plants (84 papers in training set; Top 1.0%; 1.8%)
15. Cell (370 papers in training set; Top 12%; 1.6%)
16. DNA Research (23 papers in training set; Top 0.3%; 1.4%)
17. PLOS Genetics (756 papers in training set; Top 11%; 1.3%)
18. Frontiers in Genetics (197 papers in training set; Top 6%; 1.3%)
19. Computational and Structural Biotechnology Journal (216 papers in training set; Top 6%; 1.2%)
20. Journal of Genetics and Genomics (36 papers in training set; Top 1%; 1.2%)
21. Proceedings of the National Academy of Sciences (2130 papers in training set; Top 40%; 1.1%)
22. GigaScience (172 papers in training set; Top 3%; 0.8%)
23. G3 Genes|Genomes|Genetics (351 papers in training set; Top 2%; 0.8%)
24. New Phytologist (309 papers in training set; Top 5%; 0.7%)
25. Science (429 papers in training set; Top 21%; 0.7%)
26. PLOS ONE (4510 papers in training set; Top 70%; 0.7%)
27. Communications Biology (886 papers in training set; Top 28%; 0.7%)
28. Plant Direct (81 papers in training set; Top 2%; 0.6%)
29. Gigabyte (60 papers in training set; Top 2%; 0.6%)
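
The 50% cutoff marked in the list above can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch of that arithmetic, not the site's own method: the function name `journals_to_reach` is hypothetical, and the values are the rounded percentages shown for the first ten journals, so the running total is only approximate.

```python
# Rounded probabilities (percent) for ranks 1-10, copied from the list above.
probs = [17.8, 9.7, 7.9, 6.5, 6.5, 6.1, 4.1, 4.1, 3.4, 2.6]


def journals_to_reach(probs, threshold):
    """Return (rank, running_total) at the first rank where the
    cumulative probability reaches the threshold.

    Returns (None, running_total) if the threshold is never reached.
    This helper is an illustrative assumption, not part of the source page.
    """
    total = 0.0
    for rank, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None, total


rank, total = journals_to_reach(probs, 50.0)
print(f"{rank} journals reach the threshold (cumulative {total:.1f}%)")
```

With the rounded values, the first five journals sum to about 48.4%, so the sixth is the first to push the running total past 50%.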