Accounting for fragments of unexpected origin improves transcript quantification in RNA-seq simulations focused on increased realism

Srivastava, A.; Zakeri, M.; Sarkar, H.; Soneson, C.; Kingsford, C.; Patro, R.

2021-01-19 bioinformatics
10.1101/2021.01.17.426996 bioRxiv
Transcript and gene quantification is the first step in many RNA-seq analyses. While many factors and properties of experimental RNA-seq data likely contribute to differences in accuracy between various approaches to quantification, it has been demonstrated (1) that quantification accuracy generally benefits from considering, during alignment, potential genomic origins for sequenced fragments that reside outside of the annotated transcriptome. Recently, Varabyou et al. (2) demonstrated that the presence of transcriptional noise leads to systematic errors in the ability of tools -- particularly annotation-based ones -- to accurately estimate transcript expression. Here, we confirm the findings of Varabyou et al. (2) using the simulation framework they have provided. Using the same data, we also examine the methodology of Srivastava et al. (1) as implemented in recent versions of salmon (3), and show that it substantially enhances the accuracy of annotation-based transcript quantification in these data.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. NAR Genomics and Bioinformatics; 214 papers in training set; Top 0.1%; 18.4%
2. Bioinformatics; 1061 papers in training set; Top 2%; 18.4%
3. PLOS Computational Biology; 1633 papers in training set; Top 4%; 8.3%
4. BMC Bioinformatics; 383 papers in training set; Top 2%; 6.2%

50% of probability mass above

5. Bioinformatics Advances; 184 papers in training set; Top 0.7%; 4.8%
6. Molecular Biology and Evolution; 488 papers in training set; Top 1%; 4.3%
7. RNA; 169 papers in training set; Top 0.1%; 3.2%
8. Computational and Structural Biotechnology Journal; 216 papers in training set; Top 3%; 2.6%
9. Biophysical Journal; 545 papers in training set; Top 2%; 2.1%
10. Journal of Bioinformatics and Systems Biology; 14 papers in training set; Top 0.1%; 1.9%
11. PeerJ; 261 papers in training set; Top 6%; 1.9%
12. G3 Genes|Genomes|Genetics; 351 papers in training set; Top 1%; 1.8%
13. Nucleic Acids Research; 1128 papers in training set; Top 11%; 1.7%
14. PLOS ONE; 4510 papers in training set; Top 55%; 1.7%
15. BMC Research Notes; 29 papers in training set; Top 0.1%; 1.6%
16. Briefings in Bioinformatics; 326 papers in training set; Top 5%; 1.3%
17. G3: Genes, Genomes, Genetics; 222 papers in training set; Top 0.6%; 1.2%
18. mSystems; 361 papers in training set; Top 6%; 1.2%
19. Genetics; 225 papers in training set; Top 3%; 1.1%
20. Scientific Reports; 3102 papers in training set; Top 70%; 0.9%
21. F1000Research; 79 papers in training set; Top 3%; 0.9%
22. Frontiers in Genetics; 197 papers in training set; Top 8%; 0.9%
23. GigaScience; 172 papers in training set; Top 3%; 0.9%
24. Physical Biology; 43 papers in training set; Top 2%; 0.7%
25. Proceedings of the National Academy of Sciences; 2130 papers in training set; Top 45%; 0.7%
26. iScience; 1063 papers in training set; Top 35%; 0.7%
27. Genome Biology; 555 papers in training set; Top 9%; 0.6%
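
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed percentages. The sketch below is only an illustration: the function name `journals_to_reach` is hypothetical, the values are copied from the first five entries of the list above, and it assumes the cutoff means "the first rank at which the cumulative probability reaches or exceeds 50%". It is not how the site computes its ranking.

```python
from itertools import accumulate

# Predicted probabilities (in percent) for the five top-ranked journals,
# copied from the list above.
probabilities = [
    ("NAR Genomics and Bioinformatics", 18.4),
    ("Bioinformatics", 18.4),
    ("PLOS Computational Biology", 8.3),
    ("BMC Bioinformatics", 6.2),
    ("Bioinformatics Advances", 4.8),
]


def journals_to_reach(entries, threshold):
    """Return how many top-ranked entries are needed before the running
    total of their probabilities reaches `threshold` (hypothetical helper).

    Returns None if the threshold is never reached.
    """
    running_totals = accumulate(prob for _, prob in entries)
    for count, total in enumerate(running_totals, start=1):
        if total >= threshold:
            return count
    return None


print(journals_to_reach(probabilities, 50.0))  # prints 4
```

The first three entries sum to about 45.1%, and adding the fourth brings the total to about 51.3%, which agrees with the placement of the "50% of probability mass above" marker after rank 4.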