Identification and characterization of retro-DNAs, a new type of retrotransposons originated from DNA transposons, in primate genomes

Tang, W.; Liang, P.

2020-03-20 evolutionary biology
10.1101/2020.03.19.999144 bioRxiv
Mobile elements (MEs) fall into two major classes based on their transposition mechanism: retrotransposons and DNA transposons. DNA transposons move within the genome directly as DNA in a "cut-and-paste" fashion, while retrotransposons use an RNA intermediate to transpose in a "copy-and-paste" fashion. In addition to target site duplications (TSDs), a hallmark of transposition shared by both classes, DNA transposons also carry terminal inverted repeats (TIRs). DNA transposons constitute ~3% of primate genomes and, despite their success during early primate evolution, are thought to have been inactive in primate genomes since ~37 My ago. Retrotransposons can be further divided into long terminal repeat (LTR) retrotransposons, which are characterized by the presence of LTRs at both ends, and non-LTR retrotransposons, which lack LTRs. In primate genomes, LTR retrotransposons constitute ~9% of the genome and have a low level of ongoing activity, while non-LTR retrotransposons are the major type of ME, contributing ~37% of the genome, with some members being very young and currently active in retrotransposition. The four known types of non-LTR retrotransposons are LINEs, SINEs, SVAs, and processed pseudogenes, all characterized by the presence of a polyA tail and TSDs, which mostly range from 8 to 15 bp in length. All non-LTR retrotransposons are known to use the L1-based target-primed reverse transcription (TPRT) machinery for retrotransposition.

In this study, we report a new type of non-LTR retrotransposon, which we name retro-DNAs: elements that are DNA transposons by sequence but non-LTR retrotransposons by transposition mechanism in recent primate genomes. Using a bioinformatics comparative genomics approach, we identified a total of 1,750 retro-DNAs, representing 748 unique insertion events, in the human genome and nine non-human primate genomes from the ape and monkey groups. These retro-DNAs, mostly fragments of full-length DNA transposons, carry no TIRs but have longer TSDs; ~23.5% also carry a polyA tail, and their insertion site motifs and TSD length pattern are characteristic of non-LTR retrotransposons. These features suggest that retro-DNAs are DNA transposon sequences likely mobilized by the TPRT mechanism. Further, at least 40% of these retro-DNAs are located in genic regions, giving them significant potential to impact gene function. More interestingly, some retro-DNAs, as well as their parent sites, show some level of current transcriptional expression, suggesting that they have the potential to generate more retro-DNAs in current primate genomes. The identification of retro-DNAs, although small in number, reveals a new mechanism for propagating DNA transposon sequences in primate genomes in the absence of canonical DNA transposon activity. It also suggests that the L1 TPRT machinery may be able to retrotranspose a wider variety of DNA sequences than is currently known.
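The hallmarks named in the abstract (TSDs flanking the insertion, TIRs at the element ends, and a polyA tail) can be illustrated with a small Python sketch. This is not the authors' pipeline, which the abstract does not describe in detail. The function names, the exact-match comparison, the 10 bp TIR window, and the 8 bp minimum TSD length (taken from the lower end of the 8 to 15 bp range quoted above) are all assumptions made for illustration.

```python
# Illustrative sketch only: hypothetical helpers, exact matching, toy thresholds.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tsd_length(left_flank: str, right_flank: str, max_len: int = 30) -> int:
    """Longest k such that the last k bases of the left flank equal the
    first k bases of the right flank (an exact-match TSD)."""
    left, right = left_flank.upper(), right_flank.upper()
    best = 0
    for k in range(1, min(max_len, len(left), len(right)) + 1):
        if left[-k:] == right[:k]:
            best = k
    return best


def has_tir(element: str, k: int = 10) -> bool:
    """True if the first k bases equal the reverse complement of the last k."""
    element = element.upper()
    if len(element) < 2 * k:
        return False
    return element[:k] == reverse_complement(element[-k:])


def polya_length(element: str) -> int:
    """Number of consecutive A bases at the 3' end of the element."""
    element = element.upper()
    return len(element) - len(element.rstrip("A"))


def summarize(left_flank: str, element: str, right_flank: str,
              min_tsd: int = 8) -> dict:
    """Collect the three hallmarks and flag retro-DNA-like candidates:
    no TIRs plus a TSD of at least min_tsd bp (assumed threshold)."""
    tsd = tsd_length(left_flank, right_flank)
    tir = has_tir(element)
    return {
        "tsd_len": tsd,
        "has_tir": tir,
        "polya_len": polya_length(element),
        "retro_dna_like": (not tir) and tsd >= min_tsd,
    }
```

For example, `summarize("CCGT" + "AAGAATTCTG", "GATTACAGGCTTCAGG" + "A" * 12, "AAGAATTCTG" + "TTGC")` describes a toy insertion with a 10 bp TSD, no TIRs, and a 12 bp polyA tail. A real analysis would need to tolerate mismatches in TSDs and TIRs and to compare orthologous sites across genomes, which this sketch does not attempt.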

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. Mobile DNA | 27 papers in training set | Top 0.1% | 22.7%
2. Genes | 126 papers in training set | Top 0.1% | 8.5%
3. Frontiers in Genetics | 197 papers in training set | Top 1% | 4.9%
4. PLOS ONE | 4510 papers in training set | Top 36% | 4.0%
5. PLOS Genetics | 756 papers in training set | Top 4% | 4.0%
6. eLife | 5422 papers in training set | Top 25% | 3.6%
7. Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 2% | 2.9%

50% of probability mass above

8. Scientific Reports | 3102 papers in training set | Top 43% | 2.9%
9. Genome Biology and Evolution | 280 papers in training set | Top 0.6% | 2.8%
10. Nucleic Acids Research | 1128 papers in training set | Top 8% | 2.1%
11. Frontiers in Cell and Developmental Biology | 218 papers in training set | Top 3% | 2.1%
12. Nature Communications | 4913 papers in training set | Top 48% | 1.9%
13. The Plant Journal | 197 papers in training set | Top 2% | 1.5%
14. National Science Review | 22 papers in training set | Top 1% | 1.5%
15. Journal of Genetics and Genomics | 36 papers in training set | Top 1% | 1.5%
16. Philosophical Transactions of the Royal Society B | 51 papers in training set | Top 3% | 1.5%
17. Science China Life Sciences | 26 papers in training set | Top 1% | 1.3%
18. Molecular Biology and Evolution | 488 papers in training set | Top 3% | 1.3%
19. Viruses | 318 papers in training set | Top 4% | 1.2%
20. Genome Biology | 555 papers in training set | Top 5% | 1.2%
21. Science | 429 papers in training set | Top 17% | 1.0%
22. Journal of Virology | 456 papers in training set | Top 3% | 0.9%
23. Protein & Cell | 25 papers in training set | Top 2% | 0.9%
24. Nature Ecology & Evolution | 113 papers in training set | Top 4% | 0.8%
25. Communications Biology | 886 papers in training set | Top 23% | 0.8%
26. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 44% | 0.8%
27. iScience | 1063 papers in training set | Top 32% | 0.8%
28. Molecular Plant | 36 papers in training set | Top 1% | 0.8%
29. The CRISPR Journal | 33 papers in training set | Top 0.3% | 0.7%
30. Cell | 370 papers in training set | Top 17% | 0.7%
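
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked from the listed percentages with a running sum. The short Python sketch below does this. The variable and function names are illustrative, and the values are copied from the list above in rank order.

```python
from itertools import accumulate

# Predicted probabilities (%) for ranks 1 to 30, copied from the list above.
LISTED_PROBABILITIES = [
    22.7, 8.5, 4.9, 4.0, 4.0, 3.6, 2.9, 2.9, 2.8, 2.1,
    2.1, 1.9, 1.5, 1.5, 1.5, 1.5, 1.3, 1.3, 1.2, 1.2,
    1.0, 0.9, 0.9, 0.8, 0.8, 0.8, 0.8, 0.8, 0.7, 0.7,
]


def ranks_to_reach(probabilities, threshold):
    """Return the smallest rank whose cumulative probability reaches
    the threshold, or None if the listed values never reach it."""
    for rank, cumulative in enumerate(accumulate(probabilities), start=1):
        if cumulative >= threshold:
            return rank
    return None


print(ranks_to_reach(LISTED_PROBABILITIES, 50.0))  # prints 7
```

The cumulative sum is about 47.7% after rank 6 and about 50.6% after rank 7, which agrees with the cutoff shown in the list.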