Common Pitfalls in CircRNA Detection and Quantification
Weyrich, M.; Trummer, N.; Boehm, F.; Furth, P. A.; Hoffmann, M.; List, M.
Circular RNAs (circRNAs) have garnered considerable interest, as they have been implicated in numerous biological processes and diseases. Because of their stability, they are often considered promising biomarker candidates or therapeutic targets. Due to the lack of a poly(A) tail, circRNAs are best detected in total RNA-seq data after depleting ribosomal RNA. However, we observe that circRNA detection is still applied to the vastly more ubiquitous poly(A)-enriched RNA-seq data. In this study, we systematically compare the detection of circRNAs in two matched poly(A) and ribosomal RNA-depleted data sets. Our results indicate that the comparably few circRNAs detected in poly(A) data are likely false positives. In addition, we demonstrate that the quality of sample processing, as measured by the fraction of ribosomal reads, significantly affects the sensitivity of circRNA detection, leading to a bias in downstream analysis. Our findings establish best practices for circRNA research: total RNA sequencing with effective rRNA depletion is the preferred approach for accurate circRNA profiling, whereas poly(A)-enriched data are unsuitable for comprehensive detection. Employing multiple circRNA detection tools and prioritizing back-splice junctions identified by several algorithms enhances confidence in the selection of candidates. These recommendations, validated across diverse datasets and tissue types, provide generalizable principles for robust circRNA analysis.
Key Points
- Ribosomal RNA contamination substantially impairs the accuracy of circRNA detection. This technical confounding factor has thus far received limited attention in the field.
- Tool agreement for circRNA calls is moderate in total RNA-seq but essentially absent in poly(A)-enriched RNA-seq data, underscoring the importance of using multiple tools for circRNA detection.
- Back-splice junctions detected in poly(A)-enriched RNA-seq data are predominantly tool-specific artifacts rather than genuine circRNAs, challenging the validity of circRNA identification in poly(A)-enriched datasets.
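The recommendation to prioritize back-splice junctions reported by several tools, and to monitor the ribosomal read fraction per sample, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names `consensus_bsjs` and `rrna_fraction`, the `min_tools` threshold, and the junction key (chromosome, start, end, strand) are assumptions made for illustration, and the sketch further assumes that all tools report coordinates in the same convention so that exact matching is meaningful.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical junction key: (chromosome, start, end, strand).
# Assumes every tool reports coordinates in the same convention.
BSJ = Tuple[str, int, int, str]


def consensus_bsjs(
    calls: Dict[str, Iterable[BSJ]], min_tools: int = 2
) -> Dict[BSJ, Set[str]]:
    """Keep back-splice junctions reported by at least `min_tools` tools.

    `calls` maps a tool name to the junctions that tool reported.
    The result maps each retained junction to the set of supporting tools.
    """
    support: Dict[BSJ, Set[str]] = defaultdict(set)
    for tool, junctions in calls.items():
        for bsj in junctions:
            support[bsj].add(tool)
    return {bsj: tools for bsj, tools in support.items() if len(tools) >= min_tools}


def rrna_fraction(rrna_reads: int, total_reads: int) -> float:
    """Fraction of reads assigned to ribosomal RNA in one sample."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= rrna_reads <= total_reads:
        raise ValueError("rrna_reads must lie between 0 and total_reads")
    return rrna_reads / total_reads
```

Calling `consensus_bsjs` with the per-tool call sets of one sample returns only the junctions with multi-tool support, while `rrna_fraction` gives a per-sample value that could be inspected before samples are compared. Which threshold counts as acceptable is left open here, since the abstract does not state one.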
Matching journals
The top 5 journals account for 50% of the predicted probability mass.