
Systematic annotation of Helitron-like elements in eukaryote genomes using HELIANO

Li, Z.; Gilbert, C.; Peng, H.; Pollet, N.

2024-02-09 evolutionary biology
10.1101/2024.02.08.579435 bioRxiv

Helitron-like elements (HLEs) are widespread eukaryotic DNA transposons employing a rolling-circle transposition mechanism. Despite their prevalence in fungal, animal, and plant genomes, identifying Helitrons remains challenging. We introduce HELIANO, a software tool for annotating and classifying autonomous and non-autonomous Helitron and Helentron sequences from whole genomes. HELIANO outperforms existing tools in speed and accuracy, as demonstrated through benchmarking and its application to complex genomes (Xenopus tropicalis, Xenopus laevis, Oryza sativa), revealing numerous newly identified Helitrons and Helentrons. In a comprehensive analysis of 404 eukaryote genomes, we found HLEs widely distributed across phyla, with exceptions in specific taxa. Helentrons were identified in numerous land plant species, and 20 protein domains were discovered integrated within specific autonomous HLE families. A global phylogenetic analysis confirmed the classification into the main clades Helentron and Helitron, revealing nine subgroups, some enriched in particular taxa. The future use of HELIANO will contribute to the global analysis of TEs across genomes and enhance our understanding of this transposon superfamily.

Matching journals

The top-ranked journal alone accounts for more than 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | Mobile DNA | 27 | Top 0.1% | 63.0%
(50% of probability mass above)
2 | Bioinformatics | 1061 | Top 3% | 7.4%
3 | Plant Communications | 35 | Top 0.5% | 2.8%
4 | Nucleic Acids Research | 1128 | Top 9% | 1.9%
5 | Genome Biology | 555 | Top 4% | 1.9%
6 | PLOS Genetics | 756 | Top 10% | 1.4%
7 | Genomics, Proteomics & Bioinformatics | 171 | Top 4% | 1.4%
8 | PLOS ONE | 4510 | Top 58% | 1.4%
9 | NAR Genomics and Bioinformatics | 214 | Top 3% | 1.3%
10 | Nature Communications | 4913 | Top 59% | 0.9%
11 | Frontiers in Plant Science | 240 | Top 5% | 0.8%
12 | Communications Biology | 886 | Top 22% | 0.8%
13 | Genes | 126 | Top 3% | 0.7%
14 | Peer Community Journal | 254 | Top 4% | 0.7%
15 | Scientific Reports | 3102 | Top 78% | 0.7%
16 | Advanced Science | 249 | Top 21% | 0.7%
17 | The Plant Journal | 197 | Top 3% | 0.7%
18 | Genome Biology and Evolution | 280 | Top 2% | 0.7%
19 | eLife | 5422 | Top 63% | 0.5%
20 | PLOS Computational Biology | 1633 | Top 28% | 0.5%
21 | Molecular Plant | 36 | Top 2% | 0.5%
22 | Bioinformatics Advances | 184 | Top 5% | 0.5%
23 | Frontiers in Genetics | 197 | Top 12% | 0.5%