Hide and seek: de novo identification in sugar beet reveals impact of non-autonomous LTR retrotransposons

Maiwald, S.; Maiwald, F.; Heitkam, T.

2026-03-03 genomics
10.64898/2026.03.01.708851 bioRxiv
Plant genomes are filled with retrotransposons and their derivatives, which are subject to constant sequence turnover. As short, non-autonomous retrotransposons do not encode a protein product, they experience reduced selective constraints on their DNA sequence, leading to diversification into multiple families that are usually limited to only a few species. This absence of coding capacity and their tendency to form subfamilies explain why non-autonomous LTR retrotransposons are incompletely described in most, if not all, genomic repeat annotations. Here, we focus on non-autonomous LTR retrotransposon identification. Are all of these sequences derivatives of easier-to-identify full-length elements? Or is there more variability that is currently overlooked? To answer this, we capitalize on our comprehensive understanding of the transposable element (TE) landscape in sugar beet to assess the extent of the blind spot on non-autonomous LTR retrotransposons. We present a workflow to identify non-autonomous LTR retrotransposons without prior sequence information, retrieving more than 100 families within the sugar beet genome. We only include TEs without the ability for complete self-mobilization. Spanning up to 15,000 bp, these non-autonomous families are often longer than expected and are characterized by reshuffling and modular evolution. Most strikingly, only a few of these families are directly derived from autonomous partners, showing that there is a large, undiscovered TE variety in the non-autonomous TE fraction. We highlight that a large fraction of non-autonomous TEs will not be retrieved with current TE identification workflows, even if the output is well curated and condensed into TE libraries, and we suggest procedures to remedy this gap. This study provides the first insight into the non-autonomous LTR retrotransposon landscape within a single genome and serves as an example for estimating the error in non-autonomous TE detection.

Matching journals

The top 1 journal accounts for 50% of the predicted probability mass.
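
The statement above can be checked with a cumulative sum over the ranked probabilities. The following is a minimal sketch, not the site's actual method: the function name `journals_to_reach` is hypothetical, and the values are the predicted probabilities (in percent) of the five highest-ranked journals as shown in the listing.

```python
from itertools import accumulate

# Predicted probabilities in percent for the five highest-ranked journals,
# copied from the listing (Mobile DNA first).
probabilities = [52.3, 4.9, 3.7, 3.6, 2.7]


def journals_to_reach(mass_threshold, probs):
    """Return how many top-ranked entries are needed for the running
    total to reach mass_threshold, or None if it is never reached.

    Hypothetical helper for illustration only.
    """
    for count, cumulative in enumerate(accumulate(probs), start=1):
        if cumulative >= mass_threshold:
            return count
    return None  # threshold not reached by the supplied entries


print(journals_to_reach(50.0, probabilities))  # prints 1
```

Because the first entry alone (52.3%) already exceeds 50%, a single journal covers half of the predicted probability mass.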

1. Mobile DNA: 27 papers in training set, Top 0.1%, 52.3%
(50% of probability mass above)
2. The Plant Cell: 141 papers in training set, Top 0.6%, 4.9%
3. Frontiers in Plant Science: 240 papers in training set, Top 2%, 3.7%
4. The Plant Journal: 197 papers in training set, Top 1%, 3.6%
5. Nucleic Acids Research: 1128 papers in training set, Top 7%, 2.7%
6. Nature Communications: 4913 papers in training set, Top 48%, 1.9%
7. PLOS Genetics: 756 papers in training set, Top 8%, 1.9%
8. Scientific Reports: 3102 papers in training set, Top 58%, 1.7%
9. PLOS ONE: 4510 papers in training set, Top 57%, 1.5%
10. Frontiers in Genetics: 197 papers in training set, Top 6%, 1.3%
11. Computational and Structural Biotechnology Journal: 216 papers in training set, Top 6%, 1.3%
12. Genome Biology and Evolution: 280 papers in training set, Top 1%, 1.2%
13. NAR Genomics and Bioinformatics: 214 papers in training set, Top 3%, 1.2%
14. Genome Biology: 555 papers in training set, Top 6%, 1.0%
15. Proceedings of the National Academy of Sciences: 2130 papers in training set, Top 41%, 0.9%
16. Cell Genomics: 162 papers in training set, Top 5%, 0.9%
17. eLife: 5422 papers in training set, Top 53%, 0.9%
18. Genes: 126 papers in training set, Top 2%, 0.9%
19. The Plant Genome: 53 papers in training set, Top 0.6%, 0.8%
20. Plant Communications: 35 papers in training set, Top 1%, 0.7%
21. Nature Plants: 84 papers in training set, Top 2%, 0.7%
22. Horticulture Research: 43 papers in training set, Top 2%, 0.7%
23. Molecular Biology and Evolution: 488 papers in training set, Top 5%, 0.6%
24. New Phytologist: 309 papers in training set, Top 5%, 0.6%
25. Molecular Plant: 36 papers in training set, Top 2%, 0.6%
26. Genetics: 225 papers in training set, Top 5%, 0.5%
27. PLOS Computational Biology: 1633 papers in training set, Top 29%, 0.5%