SMARTIE: A Machine-Learning approach for investigating RBP-RNA interactions identified by Editing

Koppaka, O.; Kumar, U.; Ahuja, G.; Yadav, R.; Bakthavachalu, B.

2026-05-19 bioinformatics
10.64898/2026.05.18.726004 bioRxiv

RNA-binding proteins (RBPs) play important roles in gene regulation. RNA editing-based approaches, such as TRIBE and STAMP, have gained wider use for identifying RNA targets of RBPs. These methods offer advantages over crosslinking-based approaches in terms of experimental simplicity and in vivo applicability. However, data analysis methods for these approaches remain underdeveloped, limiting sensitivity and unbiased target prioritization. To address these limitations, we introduce SMARTIE (Systematic Machine-learning Approach for RBP Targets Identified by Editing), a machine-learning-based framework. SMARTIE robustly identifies and ranks RBP target RNAs from editing data by integrating statistical tests with replicate-aware and confidence-weighted features. Reanalysis of multiple published TRIBE datasets demonstrates the effectiveness of SMARTIE: it recovers targets of RBPs such as Ataxin-2, TDP-43, Hrp48, Thor, GPATCH8, dFMRP and NonA. Notably, a model trained on TRIBE data generalizes to STAMP datasets, suggesting that SMARTIE learns universal signatures of editing-based RBP targeting, thereby enabling more accurate inference of RBP-RNA interactions.

[Graphical Abstract: Figure 1]

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Bioinformatics | 1061 papers in training set | Top 1% | 22.7%
2. Nucleic Acids Research | 1128 papers in training set | Top 0.8% | 14.8%
3. Genome Biology | 555 papers in training set | Top 1% | 6.4%
4. Bioinformatics Advances | 184 papers in training set | Top 0.6% | 4.9%
5. Cell Systems | 167 papers in training set | Top 3% | 4.3%
(50% of probability mass above)
6. Briefings in Bioinformatics | 326 papers in training set | Top 1% | 4.3%
7. Nature Communications | 4913 papers in training set | Top 37% | 4.0%
8. PLOS Computational Biology | 1633 papers in training set | Top 9% | 3.7%
9. NAR Genomics and Bioinformatics | 214 papers in training set | Top 0.7% | 3.6%
10. Cell Reports Methods | 141 papers in training set | Top 0.8% | 3.6%
11. Nature Methods | 336 papers in training set | Top 3% | 3.6%
12. Genome Research | 409 papers in training set | Top 2% | 2.5%
13. Nature Biotechnology | 147 papers in training set | Top 4% | 1.9%
14. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 4% | 1.8%
15. Journal of Molecular Biology | 217 papers in training set | Top 2% | 1.3%
16. Advanced Science | 249 papers in training set | Top 14% | 1.2%
17. PLOS ONE | 4510 papers in training set | Top 59% | 1.2%
18. iScience | 1063 papers in training set | Top 21% | 1.2%
19. Cell Genomics | 162 papers in training set | Top 5% | 1.0%
20. RNA | 169 papers in training set | Top 0.4% | 0.8%
21. BMC Bioinformatics | 383 papers in training set | Top 7% | 0.7%
22. Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 7% | 0.5%
23. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 49% | 0.5%
24. Scientific Reports | 3102 papers in training set | Top 80% | 0.5%
25. Molecular Cell | 308 papers in training set | Top 12% | 0.5%
26. The American Journal of Human Genetics | 206 papers in training set | Top 5% | 0.5%
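
The 50% cutoff in the list above can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch of that arithmetic only; the function name `journals_to_reach` is hypothetical, and nothing here describes how the page itself computes its predictions.

```python
# Predicted probabilities (percent) for the six top-ranked journals,
# copied from the list above.
PROBS = [
    ("Bioinformatics", 22.7),
    ("Nucleic Acids Research", 14.8),
    ("Genome Biology", 6.4),
    ("Bioinformatics Advances", 4.9),
    ("Cell Systems", 4.3),
    ("Briefings in Bioinformatics", 4.3),
]


def journals_to_reach(probs, threshold):
    """Return (count, cumulative mass) for the first rank at which the
    running total reaches the threshold; count is None if never reached."""
    total = 0.0
    for count, (_, p) in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return count, total
    return None, total


count, mass = journals_to_reach(PROBS, 50.0)
print(f"{count} journals cover {mass:.1f}% of the probability mass")
```

With the listed values, the running total is 48.8% after four journals and crosses 50% at the fifth, which matches the position of the cutoff line.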