
Common Pitfalls in CircRNA Detection and Quantification

Weyrich, M.; Trummer, N.; Boehm, F.; Furth, P. A.; Hoffmann, M.; List, M.

2026-02-04 bioinformatics
10.64898/2026.02.02.703185 bioRxiv

Circular RNAs have garnered considerable interest, as they have been implicated in numerous biological processes and diseases. Owing to their stability, they are often considered promising biomarker candidates or therapeutic targets. Because circRNAs lack a poly(A) tail, they are best detected in total RNA-seq data after depleting ribosomal RNA. However, we observe that circRNA detection is still applied to the far more widely available poly(A)-enriched RNA-seq data. In this study, we systematically compare the detection of circRNAs in two matched poly(A) and ribosomal RNA-depleted data sets. Our results indicate that the comparably few circRNAs detected in poly(A) data are likely false positives. In addition, we demonstrate that the quality of sample processing, as measured by the fraction of ribosomal reads, significantly affects the sensitivity of circRNA detection, leading to a bias in downstream analysis. Our findings establish best practices for circRNA research: total RNA sequencing with effective rRNA depletion is the preferred approach for accurate circRNA profiling, whereas poly(A)-enriched data are unsuitable for comprehensive detection. Employing multiple circRNA detection tools and prioritizing back-splice junctions identified by several algorithms enhances confidence in the selection of candidates. These recommendations, validated across diverse datasets and tissue types, provide generalizable principles for robust circRNA analysis.

Key Points

- Ribosomal RNA contamination substantially impairs the accuracy of circRNA detection. This technical confounding factor has thus far received limited attention in the field.
- Tool agreement for circRNA calls is moderate in total RNA-seq but essentially absent in poly(A)-enriched RNA-seq data, underscoring the importance of using multiple tools for circRNA detection.
- Back-splice junctions detected in poly(A)-enriched RNA-seq data are predominantly tool-specific artifacts rather than genuine circRNAs, challenging the validity of circRNA identification in poly(A)-enriched datasets.
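
The recommendation to prioritize back-splice junctions (BSJs) reported by several algorithms can be illustrated with a minimal sketch. It is not the authors' pipeline: the tool names (`tool_a`, `tool_b`, `tool_c`), the function `consensus_bsjs`, the `min_tools` threshold, and the example coordinates are all hypothetical, and the sketch assumes each tool's calls have already been converted to the same coordinate convention.

```python
from collections import Counter

def consensus_bsjs(calls_by_tool, min_tools=2):
    """Return BSJs supported by at least `min_tools` tools, with their support count.

    calls_by_tool maps a (hypothetical) tool name to an iterable of BSJs,
    each given as a (chrom, start, end, strand) tuple in harmonized coordinates.
    """
    support = Counter()
    for bsjs in calls_by_tool.values():
        # Deduplicate per tool so that each tool contributes at most one vote per BSJ.
        support.update(set(bsjs))
    return {bsj: n for bsj, n in support.items() if n >= min_tools}

# Made-up example calls from three hypothetical tools.
calls = {
    "tool_a": {("chr1", 100, 500, "+"), ("chr2", 40, 900, "-")},
    "tool_b": {("chr1", 100, 500, "+"), ("chr3", 10, 70, "+")},
    "tool_c": {("chr1", 100, 500, "+"), ("chr2", 40, 900, "-")},
}

kept = consensus_bsjs(calls, min_tools=2)
for bsj, n in sorted(kept.items()):
    print(bsj, "supported by", n, "tools")
```

In this toy input, the junction reported only by `tool_b` is dropped, while the two junctions seen by at least two tools are kept. The threshold is a design choice: a higher `min_tools` trades sensitivity for confidence.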

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1 | RNA | 169 papers in training set | Top 0.1% | 25.9%
2 | PLOS ONE | 4510 papers in training set | Top 25% | 6.8%
3 | RNA Biology | 70 papers in training set | Top 0.1% | 6.4%
4 | Scientific Reports | 3102 papers in training set | Top 19% | 6.3%
5 | NAR Genomics and Bioinformatics | 214 papers in training set | Top 0.3% | 4.9%
50% of probability mass above
6 | BMC Bioinformatics | 383 papers in training set | Top 2% | 4.0%
7 | Nucleic Acids Research | 1128 papers in training set | Top 5% | 4.0%
8 | PeerJ | 261 papers in training set | Top 2% | 3.6%
9 | Bioinformatics | 1061 papers in training set | Top 5% | 3.6%
10 | BMC Genomics | 328 papers in training set | Top 0.9% | 3.6%
11 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 2% | 3.1%
12 | Briefings in Bioinformatics | 326 papers in training set | Top 3% | 2.1%
13 | PLOS Computational Biology | 1633 papers in training set | Top 16% | 1.7%
14 | Genome Biology | 555 papers in training set | Top 4% | 1.7%
15 | Genome Research | 409 papers in training set | Top 2% | 1.7%
16 | International Journal of Molecular Sciences | 453 papers in training set | Top 10% | 1.3%
17 | Frontiers in Genetics | 197 papers in training set | Top 7% | 1.2%
18 | Bioinformatics Advances | 184 papers in training set | Top 4% | 1.0%
19 | Nature Communications | 4913 papers in training set | Top 59% | 0.9%
20 | Biology Methods and Protocols | 53 papers in training set | Top 2% | 0.8%
21 | Journal of Molecular Biology | 217 papers in training set | Top 4% | 0.7%
22 | Journal of Bioinformatics and Systems Biology | 14 papers in training set | Top 0.7% | 0.7%
23 | iScience | 1063 papers in training set | Top 34% | 0.7%
24 | Cell Reports Methods | 141 papers in training set | Top 6% | 0.6%
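
The statement that the top 5 journals account for 50% of the predicted probability mass can be checked against the listed values with a short cumulative sum. This is only an arithmetic sketch over the percentages shown above; the variable names and the helper `rank_reaching` are hypothetical and say nothing about how the predictions themselves were produced.

```python
from itertools import accumulate

# Predicted probabilities (in percent) in the listed rank order, ranks 1 to 24.
probabilities = [
    25.9, 6.8, 6.4, 6.3, 4.9, 4.0, 4.0, 3.6, 3.6, 3.6, 3.1, 2.1,
    1.7, 1.7, 1.7, 1.3, 1.2, 1.0, 0.9, 0.8, 0.7, 0.7, 0.7, 0.6,
]

def rank_reaching(values, threshold):
    """Return the first 1-based rank at which the cumulative sum reaches `threshold`."""
    for rank, total in enumerate(accumulate(values), start=1):
        if total >= threshold:
            return rank
    return None  # threshold never reached by the listed values

rank_50 = rank_reaching(probabilities, 50.0)
print("Rank at which cumulative probability reaches 50%:", rank_50)
```

With the listed values, the running total after rank 4 is still below 50% and crosses it at rank 5, which matches the position of the "50% of probability mass above" marker.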