SINE Retrotransposons Import Polyadenylation Signals to 3'UTRs in Dog (Canis familiaris)

Choi, J. D.; Del Pinto, L. A.; Sutter, N. B.

2020-12-01 genomics
10.1101/2020.11.30.405357 bioRxiv

Background: Messenger RNA 3' untranslated regions (3'UTRs) control many aspects of gene expression and determine where the transcript will terminate. The polyadenylation signal (PAS) AAUAAA is a key regulator of transcript termination, and this hexamer, or a similar sequence, is very frequently found within 30 bp of 3'UTR ends. Short interspersed element (SINE) retrotransposons are found throughout genomes in high copy number. When inserted into genes they can disrupt expression, alter splicing, or cause nuclear retention of mRNAs. The genomes of the domestic dog and other carnivores carry hundreds of thousands of Can-SINEs, a tRNA-related SINE with transcription termination potential. We therefore asked whether Can-SINEs may help terminate transcripts in some dog genes.

Results: Dog 3'UTRs have several peaks of AATAAA PAS frequency within 40 bp of the 3'UTR end, including peaks at 4 bp intervals, at 28, 32, and 36 bp from the end. The periodicity is partly explained by TAAA(n) repeats within Can-SINE AT-rich tails. While the density of antisense-oriented Can-SINEs in 3'UTRs is fairly constant with distance from the 3' end, sense-oriented Can-SINEs are common at the 3' end but nearly absent farther upstream. There are nine Can-SINE sub-types in the dog genome, and the sense strands of their consensus sequences (head to tail) all carry at least three PASs, while the antisense strands usually have none. We annotated all repeat-masked Can-SINE copies in the Boxer reference genome and found that the young SINEC_Cf type has a mode of 15 bp for target site duplications (TSDs). We find that all Can-SINE types favor integration at TSDs beginning with A(4). The count of AATAAA PASs differs significantly between sense- and antisense-oriented retrotransposons in transcripts. Can-SINEs near 3'UTR ends are very likely to carry AATAAA on the mRNA sense strand, while those farther upstream are not. We also identified loci where Can-SINE insertion has truncated or altered a dog 3'UTR compared to the human ortholog.

Conclusion: Dog Can-SINE activity has imported AATAAA PASs into gene transcripts and led to alteration of 3'UTRs. AATAAA sequences are selectively removed from Can-SINEs in introns and upstream 3'UTR regions but are retained at the far downstream end of 3'UTRs, which we infer reflects their role as termination sequences for these transcripts.
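
The link between TAAA(n) tails and PAS peaks spaced 4 bp apart can be illustrated with a short sketch. This is not the authors' pipeline: the function names, the toy sequences, and the convention of measuring distance from the first base of the hexamer to the last base of the 3'UTR are all assumptions made for illustration.

```python
PAS = "AATAAA"  # DNA form of the AAUAAA polyadenylation signal


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def pas_starts(seq, motif=PAS):
    """Return 0-based start positions of every motif occurrence, overlaps included."""
    seq = seq.upper()
    starts = []
    i = seq.find(motif)
    while i != -1:
        starts.append(i)
        i = seq.find(motif, i + 1)  # step by 1 so overlapping hits are found
    return starts


def distances_from_3prime_end(utr, window=40, motif=PAS):
    """Distances (bp) from each motif start to the 3' end, kept if within the window.

    Assumed convention: distance = len(utr) - start position of the hexamer.
    """
    n = len(utr)
    return [n - s for s in pas_starts(utr, motif) if n - s <= window]


# Toy Can-SINE-like tail: four TAAA repeats (hypothetical sequence).
tail = "TAAA" * 4
print(pas_starts(tail))                      # [2, 6, 10]
print(pas_starts(reverse_complement(tail)))  # []
```

In this toy tail each additional TAAA unit creates another overlapping AATAAA, so hits fall 4 bp apart on the sense strand, while the reverse complement (TTTA repeats) contains none. That is the same sense/antisense asymmetry the abstract describes for Can-SINE consensus sequences, shown here only on a made-up sequence.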

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Frontiers in Genetics; 197 papers in training set; Top 0.1%; 22.9%
2. Genes; 126 papers in training set; Top 0.1%; 10.3%
3. PLOS ONE; 4510 papers in training set; Top 20%; 9.3%
4. PLOS Genetics; 756 papers in training set; Top 1%; 8.6%
50% of probability mass above
5. Gene; 41 papers in training set; Top 0.1%; 4.9%
6. BMC Genomics; 328 papers in training set; Top 0.6%; 4.0%
7. Genomics; 60 papers in training set; Top 0.3%; 3.7%
8. Mobile DNA; 27 papers in training set; Top 0.1%; 3.7%
9. PeerJ; 261 papers in training set; Top 5%; 2.1%
10. Genome Biology and Evolution; 280 papers in training set; Top 0.8%; 1.9%
11. Frontiers in Microbiology; 375 papers in training set; Top 5%; 1.7%
12. Mitochondrion; 11 papers in training set; Top 0.1%; 1.7%
13. F1000Research; 79 papers in training set; Top 2%; 1.5%
14. Nucleic Acids Research; 1128 papers in training set; Top 12%; 1.5%
15. Genome Biology; 555 papers in training set; Top 6%; 1.0%
16. Genomics, Proteomics & Bioinformatics; 171 papers in training set; Top 5%; 0.8%
17. International Journal of Molecular Sciences; 453 papers in training set; Top 14%; 0.8%
18. RNA Biology; 70 papers in training set; Top 0.5%; 0.8%
19. Frontiers in Immunology; 586 papers in training set; Top 7%; 0.8%
20. BMC Bioinformatics; 383 papers in training set; Top 7%; 0.8%
21. Bioinformatics Advances; 184 papers in training set; Top 5%; 0.8%
22. Viruses; 318 papers in training set; Top 5%; 0.8%
23. G3 Genes|Genomes|Genetics; 351 papers in training set; Top 3%; 0.7%
24. Computational and Structural Biotechnology Journal; 216 papers in training set; Top 10%; 0.7%
25. International Journal of Infectious Diseases; 126 papers in training set; Top 4%; 0.7%
26. Bioinformatics; 1061 papers in training set; Top 10%; 0.7%
27. Heliyon; 146 papers in training set; Top 8%; 0.7%
28. Journal of Bioinformatics and Systems Biology; 14 papers in training set; Top 0.9%; 0.7%
29. Scientific Reports; 3102 papers in training set; Top 78%; 0.7%
30. PLOS Computational Biology; 1633 papers in training set; Top 28%; 0.5%
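
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked with a cumulative sum over the listed percentages. The sketch below is only an illustration: the function name is hypothetical, and it assumes the final percentage shown for each journal is its predicted probability.

```python
def journals_to_reach(probabilities, threshold=50.0):
    """Return (rank, cumulative %) at which the running total first meets the threshold."""
    total = 0.0
    for rank, p in enumerate(probabilities, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None  # threshold never reached


# Percentages of the five highest-ranked journals, copied from the list above.
top_five = [22.9, 10.3, 9.3, 8.6, 4.9]
rank, total = journals_to_reach(top_five)
print(rank, round(total, 1))  # 4 51.1
```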