PERREO: An integrated pipeline for repetitive elements analysis enables the repeatome expression profiling in cancer

Rodriguez-Martin, F.; Masero-Leon, M.; Gomez-Cabello, D.

2026-04-10 bioinformatics
10.64898/2026.04.08.714730 bioRxiv
Transcriptome-wide profiling of repetitive element expression reveals transposable element-derived transcripts that are deregulated in diverse biological contexts, including cancer. However, most RNA-seq pipelines are optimized for annotated genes and substantially undercount repeat RNA molecules, limiting their discovery and characterization. Here we present PERREO, a comprehensive, user-friendly pipeline for analyzing repetitive RNA elements from short- and long-read sequencing data. PERREO performs quality control, repeat-aware alignment and quantification, differential expression analysis, co-expression network analysis, and de novo transcript assembly, and requires minimal computational expertise. We validate PERREO across cell lines, tumor tissues, and liquid biopsies, demonstrating superior sensitivity to repetitive RNA signatures compared with standard RNA-seq approaches. PERREO also integrates predictive modelling to identify biological associations and generates publication-ready visualizations. By removing the bioinformatic barrier to repetitive RNA discovery, this pipeline enables broader investigation of the repeatome's role in cellular biology and disease and, for specific analytical objectives, yields results that outperform certain existing tools and pipelines.

Matching journals

The top 4 journals together account for about 54% of the predicted probability mass, the smallest set that exceeds 50%.

1 | Nature Biotechnology | 147 papers in training set | Top 0.2% | 18.1%
2 | Nucleic Acids Research | 1128 papers in training set | Top 0.7% | 17.1%
3 | Genome Biology | 555 papers in training set | Top 0.3% | 12.0%
4 | Nature Communications | 4913 papers in training set | Top 25% | 7.0%
50% of probability mass above
5 | Nature Methods | 336 papers in training set | Top 2% | 6.2%
6 | Genome Research | 409 papers in training set | Top 1% | 3.5%
7 | Bioinformatics | 1061 papers in training set | Top 6% | 3.5%
8 | Genome Medicine | 154 papers in training set | Top 3% | 3.0%
9 | Nature | 575 papers in training set | Top 9% | 2.5%
10 | Cell Systems | 167 papers in training set | Top 5% | 2.3%
11 | Nature Genetics | 240 papers in training set | Top 4% | 2.0%
12 | Cell Genomics | 162 papers in training set | Top 3% | 1.7%
13 | NAR Genomics and Bioinformatics | 214 papers in training set | Top 2% | 1.7%
14 | Bioinformatics Advances | 184 papers in training set | Top 3% | 1.7%
15 | Cell Reports Methods | 141 papers in training set | Top 3% | 1.3%
16 | Briefings in Bioinformatics | 326 papers in training set | Top 5% | 1.2%
17 | Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 5% | 1.2%
18 | Advanced Science | 249 papers in training set | Top 17% | 0.9%
19 | Science | 429 papers in training set | Top 18% | 0.9%
20 | BMC Bioinformatics | 383 papers in training set | Top 7% | 0.8%
21 | PLOS Computational Biology | 1633 papers in training set | Top 24% | 0.8%
22 | Mobile DNA | 27 papers in training set | Top 0.2% | 0.7%
23 | Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 45% | 0.7%
24 | PLOS ONE | 4510 papers in training set | Top 69% | 0.7%
25 | iScience | 1063 papers in training set | Top 34% | 0.7%
26 | Scientific Reports | 3102 papers in training set | Top 79% | 0.6%