
T2T Pangenome Reveals a 3.3kb Structural Variation Driving the De Novo Evolution of a Subspecies-Specific NLR Gene in Rice

Fan, J.

2026-02-24 genetics
10.64898/2026.02.21.705258 bioRxiv

Background: The genomic region spanning 1.1-1.3 Mb on rice chromosome 6 is a recognized structural variation (SV) hotspot linked to Rice Black-Streaked Dwarf Virus (RBSDV) resistance. However, the precise molecular mechanism has remained elusive due to the inherent "reference bias" of the japonica-based genome, which lacks the critical causative sequences.

Methods: Leveraging a neuro-symbolic-driven analysis of gap-free Telomere-to-Telomere (T2T) pangenome datasets and the LGEMP engine, we conducted a high-resolution comparative study between indica (9311) and japonica (Nipponbare). This approach allowed us to treat genomic variations as 3D structural building blocks rather than linear strings.

Results: We identified a 3.3 kb insertion uniquely present at the 1.21 Mb locus in 9311. This SV, likely mediated by transposable elements, exhibits extreme sequence divergence (24% identity). We demonstrate that this insertion acts as a topological modifier, driving a dramatic functional shift: while the japonica allele encodes a basic DUF590 transporter, the indica allele has undergone de novo evolution into a complete CC-NBS-LRR (NLR) immune receptor. Transcriptomic profiling confirmed the generation of six novel isoforms (T01-T06) enabled by the SV's structural reorganization. Validation across 16 representative T2T assemblies confirms this 3.3 kb SV as an indica-specific "evolutionary patch," effectively filling the "missing heritability" gap in rice viral immunity.

Conclusion: Our findings uncover a novel mechanism of gene birth through structural reorganization at high-diversity hotspots. By integrating T2T pangenomics with AI-driven inference, this study provides a definitive molecular marker for the precision breeding of virus-resistant crops and redefines our understanding of subspecies-specific adaptation.

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1 | Plant Biotechnology Journal | 56 papers in training set | Top 0.1% | 14.1%
2 | Plant Communications | 35 papers in training set | Top 0.1% | 8.3%
3 | Nature Communications | 4913 papers in training set | Top 25% | 7.0%
4 | Cell Genomics | 162 papers in training set | Top 0.6% | 6.2%
5 | Nucleic Acids Research | 1128 papers in training set | Top 4% | 4.8%
6 | PLOS Genetics | 756 papers in training set | Top 5% | 3.5%
7 | Horticulture Research | 43 papers in training set | Top 0.6% | 3.5%
8 | New Phytologist | 309 papers in training set | Top 2% | 3.0%
50% of probability mass above
9 | Frontiers in Plant Science | 240 papers in training set | Top 3% | 2.8%
10 | Nature Genetics | 240 papers in training set | Top 3% | 2.6%
11 | Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 25% | 2.6%
12 | Genome Biology | 555 papers in training set | Top 3% | 2.6%
13 | Frontiers in Genetics | 197 papers in training set | Top 3% | 2.3%
14 | The Plant Journal | 197 papers in training set | Top 2% | 2.0%
15 | Molecular Plant | 36 papers in training set | Top 0.6% | 2.0%
16 | BMC Biology | 248 papers in training set | Top 0.9% | 1.9%
17 | Journal of Genetics and Genomics | 36 papers in training set | Top 0.9% | 1.8%
18 | Communications Biology | 886 papers in training set | Top 7% | 1.8%
19 | Genome Medicine | 154 papers in training set | Top 5% | 1.5%
20 | Virus Evolution | 140 papers in training set | Top 0.9% | 1.5%
21 | The Plant Cell | 141 papers in training set | Top 1% | 1.5%
22 | Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 4% | 1.3%
23 | Plant Physiology | 217 papers in training set | Top 2% | 1.1%
24 | Molecular Biology and Evolution | 488 papers in training set | Top 4% | 0.9%
25 | Scientific Reports | 3102 papers in training set | Top 71% | 0.9%
26 | Advanced Science | 249 papers in training set | Top 18% | 0.8%
27 | The Plant Genome | 53 papers in training set | Top 0.7% | 0.7%
28 | Genetics | 225 papers in training set | Top 4% | 0.7%
29 | BMC Genomics | 328 papers in training set | Top 6% | 0.7%
30 | Cell Reports | 1338 papers in training set | Top 34% | 0.7%
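
The statement that the top 8 journals cover 50% of the predicted probability mass can be checked with a running total over the listed percentages. The following is a minimal sketch, not part of the page itself: the helper name `ranks_to_reach` is hypothetical, and the values are the rounded percentages shown in the list, so the page's unrounded cutoff may differ slightly.

```python
# Predicted probabilities (percent) for ranks 1-30, copied from the list above.
# These are the rounded values as displayed, not the underlying model outputs.
probabilities = [
    14.1, 8.3, 7.0, 6.2, 4.8, 3.5, 3.5, 3.0, 2.8, 2.6,
    2.6, 2.6, 2.3, 2.0, 2.0, 1.9, 1.8, 1.8, 1.5, 1.5,
    1.5, 1.3, 1.1, 0.9, 0.9, 0.8, 0.7, 0.7, 0.7, 0.7,
]


def ranks_to_reach(probs, threshold):
    """Return (rank, running_total) for the first rank whose running total
    meets the threshold, or (None, total) if the threshold is never met."""
    total = 0.0
    for rank, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None, total


rank, cumulative = ranks_to_reach(probabilities, 50.0)
print(rank, round(cumulative, 1))
```

With the listed values, the running total first reaches 50% at rank 8, which matches the position of the "50% of probability mass above" marker.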