
A chemoinformatics-guided platform for efficient discovery of RNA-binding small molecules: Proof-of-concept for myotonic dystrophy type 1

Taghavi, A.; Shan, J.; Yao, X.; Zanon, P. R. A.; Sung, K.; Simba-Lahuas, A.; Gorlach, S.; Labuhn, H.; Salthouse, D.; Wang, Z.; Feri, A.; Disney, M. D.

2026-05-13 bioinformatics
10.64898/2026.05.08.723748 bioRxiv

Structured RNAs cause human diseases but remain challenging to target selectively with small molecules. Here, we report a chemoinformatics-guided discovery framework that integrates fingerprint-based molecular design, experimental validation, and mechanistic profiling to identify small molecules that bind highly structured, disease-associated RNAs. Using an RNA-binder fingerprint derived from known ligands, a Tversky similarity screen of >8 million compounds yielded a 150-member library enriched in chemical space for RNA-active scaffolds. Target engagement and cell-based assays identified multiple selective ligands for the pathogenic expanded triplet repeat, r(CUG)exp, that causes myotonic dystrophy type 1 (DM1) by binding and sequestering the RNA-binding protein muscleblind-like 1 (MBNL1). Biophysical and single-molecule analyses revealed that the small molecules bind the 1x1 nucleotide U/U internal loops formed when r(CUG)exp folds, partially block MBNL1 binding, and modulate RNA folding equilibria. Two optimized scaffolds rescued MBNL1-dependent splicing in patient-derived myotubes with micromolar potency and minimal cytotoxicity. This study establishes a generalizable, data-driven platform for discovering drug-like RNA-binding lead small molecules and demonstrates its application to the toxic repeat expansion RNA underlying DM1.

[Figure 1: Graphical Abstract]
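For readers unfamiliar with the similarity measure named in the abstract, the following is a minimal sketch of the Tversky index on binary fingerprints, represented here as sets of on-bit indices. The abstract does not state the fingerprint type or the weights used, so the function name `tversky`, the `alpha` and `beta` defaults, and the toy fingerprints are illustrative assumptions, not the authors' settings.

```python
def tversky(reference: set, candidate: set,
            alpha: float = 1.0, beta: float = 1.0) -> float:
    """Tversky index between two binary fingerprints given as sets of on-bits.

    S = |A & B| / (|A & B| + alpha * |A - B| + beta * |B - A|)
    alpha weights bits found only in the reference, beta weights bits
    found only in the candidate. alpha = beta = 1 gives the Tanimoto
    coefficient. The default weights are assumptions for illustration.
    """
    common = len(reference & candidate)
    only_reference = len(reference - candidate)
    only_candidate = len(candidate - reference)
    denominator = common + alpha * only_reference + beta * only_candidate
    return common / denominator if denominator else 0.0


# Hypothetical toy fingerprints: the candidate contains every reference bit
# plus four extra bits.
reference_fp = {1, 2, 3, 4}
candidate_fp = {1, 2, 3, 4, 5, 6, 7, 8}

symmetric_score = tversky(reference_fp, candidate_fp, alpha=1.0, beta=1.0)
asymmetric_score = tversky(reference_fp, candidate_fp, alpha=1.0, beta=0.0)
```

With symmetric weights the extra candidate bits lower the score, while setting `beta` to zero ignores them, which is why an asymmetric Tversky screen can favor compounds that contain a reference pattern as a substructure-like subset.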

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Advanced Science | 249 papers in training set | Top 0.4% | 18.4%
2. Nature Communications | 4913 papers in training set | Top 15% | 12.2%
3. Cell Chemical Biology | 81 papers in training set | Top 0.3% | 7.1%
4. Nature Chemical Biology | 104 papers in training set | Top 0.3% | 6.3%
5. Nucleic Acids Research | 1128 papers in training set | Top 4% | 4.8%
6. Nature Biotechnology | 147 papers in training set | Top 2% | 3.9%

50% of probability mass above

7. Molecular Therapy Nucleic Acids | 32 papers in training set | Top 0.2% | 3.5%
8. ACS Chemical Biology | 150 papers in training set | Top 0.5% | 3.5%
9. Cell Genomics | 162 papers in training set | Top 2% | 2.7%
10. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 30% | 1.9%
11. Journal of Chemical Information and Modeling | 207 papers in training set | Top 2% | 1.7%
12. Chemical Science | 71 papers in training set | Top 1% | 1.6%
13. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 5% | 1.6%
14. Cell Systems | 167 papers in training set | Top 8% | 1.6%
15. Acta Pharmaceutica Sinica B | 11 papers in training set | Top 0.5% | 1.5%
16. Nature Methods | 336 papers in training set | Top 5% | 1.1%
17. Cell Reports Physical Science | 18 papers in training set | Top 0.5% | 0.9%
18. Communications Chemistry | 39 papers in training set | Top 0.7% | 0.9%
19. eLife | 5422 papers in training set | Top 54% | 0.9%
20. Cell Reports Methods | 141 papers in training set | Top 5% | 0.8%
21. Journal of the American Chemical Society | 199 papers in training set | Top 5% | 0.8%
22. Nature Machine Intelligence | 61 papers in training set | Top 4% | 0.7%
23. Angewandte Chemie International Edition | 81 papers in training set | Top 4% | 0.7%
24. Protein & Cell | 25 papers in training set | Top 2% | 0.7%
25. iScience | 1063 papers in training set | Top 33% | 0.7%
26. Cell Reports Medicine | 140 papers in training set | Top 9% | 0.7%
27. Briefings in Bioinformatics | 326 papers in training set | Top 7% | 0.7%
28. NAR Molecular Medicine | 18 papers in training set | Top 0.4% | 0.6%
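
The statement that the top 6 journals account for 50% of the predicted probability mass can be checked with a running sum. The sketch below uses the first eight probabilities from the listing; the function name `journals_to_reach` and the variable names are hypothetical, and how the page itself computes the cutoff is not stated.

```python
from itertools import accumulate

# Predicted probabilities (percent) for the eight highest-ranked journals,
# copied from the listing in rank order.
top_probabilities = [18.4, 12.2, 7.1, 6.3, 4.8, 3.9, 3.5, 3.5]


def journals_to_reach(probabilities, threshold=50.0):
    """Return how many top-ranked entries are needed for the running
    total to reach the threshold, or None if it is never reached."""
    for rank, running_total in enumerate(accumulate(probabilities), start=1):
        if running_total >= threshold:
            return rank
    return None


cutoff_rank = journals_to_reach(top_probabilities)
```

The running total after five journals is about 48.8%, and adding the sixth brings it to about 52.7%, so the cutoff falls at rank 6.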