Comprehensive mRNA annotation in trypanosomatid parasites

Dobramysl, U.; Wheeler, R. J.

2026-02-25 genomics
10.64898/2026.02.24.707742 bioRxiv
Trypanosomatid parasites, including human-infective Leishmania and Trypanosoma species, have an unusual genome organisation and mode of transcription. They are unicellular eukaryotes but, unlike most eukaryotes, which have an individual promoter per gene, they co-transcribe most protein-coding genes in long gene arrays. The resulting nascent transcript is processed into individual mRNAs by trans-splicing and polyadenylation. Accurate analysis of transcription, transcript processing and transcript abundance requires accurate genome annotation of spliced leader acceptor sites, polyadenylation sites and the resulting 5' and 3' mRNA untranslated regions (UTRs). Here, we describe tools for annotating these features from short-read RNA sequencing data and for measuring the usage of spliced leader acceptor and polyadenylation sites. These are practical, scalable software packages, and we use them to annotate UTRs across all available trypanosomatid genomes.

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Bioinformatics: 1061 papers in training set; Top 2%; 14.4%
2. Nucleic Acids Research: 1128 papers in training set; Top 2%; 10.2%
3. PLOS Computational Biology: 1633 papers in training set; Top 3%; 10.2%
4. NAR Genomics and Bioinformatics: 214 papers in training set; Top 0.1%; 10.2%
5. BMC Bioinformatics: 383 papers in training set; Top 2%; 4.9%
6. PLOS ONE: 4510 papers in training set; Top 31%; 4.9%
50% of probability mass above
7. PLOS Neglected Tropical Diseases: 378 papers in training set; Top 2%; 4.0%
8. PLOS Pathogens: 721 papers in training set; Top 4%; 2.9%
9. BMC Genomics: 328 papers in training set; Top 1%; 2.7%
10. G3 Genes|Genomes|Genetics: 351 papers in training set; Top 0.9%; 2.6%
11. Nature Communications: 4913 papers in training set; Top 46%; 2.1%
12. PLOS Genetics: 756 papers in training set; Top 7%; 2.1%
13. Scientific Reports: 3102 papers in training set; Top 50%; 2.1%
14. Genetics: 225 papers in training set; Top 2%; 2.1%
15. Genome Research: 409 papers in training set; Top 2%; 2.1%
16. Genomics: 60 papers in training set; Top 1%; 1.7%
17. Molecular Biology and Evolution: 488 papers in training set; Top 2%; 1.7%
18. iScience: 1063 papers in training set; Top 21%; 1.2%
19. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 41%; 0.9%
20. RNA: 169 papers in training set; Top 0.4%; 0.9%
21. Communications Biology: 886 papers in training set; Top 18%; 0.9%
22. Frontiers in Genetics: 197 papers in training set; Top 8%; 0.9%
23. Genome Biology: 555 papers in training set; Top 6%; 0.9%
24. BMC Biology: 248 papers in training set; Top 4%; 0.8%
25. eLife: 5422 papers in training set; Top 57%; 0.8%
26. Bioinformatics Advances: 184 papers in training set; Top 5%; 0.7%
27. Scientific Data: 174 papers in training set; Top 3%; 0.6%
28. GigaScience: 172 papers in training set; Top 4%; 0.5%