
RANKOR: Direct Drug Prioritization from Bulk and Single-Cell Transcriptomic Signatures

Katsaouni, N.; Schulz, M. H.

2026-05-21 bioinformatics
10.64898/2026.05.20.726471 bioRxiv

Background: Prioritizing therapeutics from transcriptomic data remains a key challenge in precision medicine. Signature reversal approaches, most commonly implemented through Gene Set Enrichment Analysis (GSEA), have been widely used to match disease signatures to candidate drugs. However, enrichment-based methods can be sensitive to noise and are restricted to previously profiled compounds.

Methods: We developed RANKOR, a machine-learning framework designed to rank candidate drugs directly from transcriptomic signatures. Rather than predicting full expression profiles, RANKOR learns structured latent representations of transcriptional responses alongside chemical structure, enabling prioritization from standardized signatures derived from disease states or treatment perturbations. The framework is applicable to both bulk and single-cell transcriptomic data.

Results: Across large-scale perturbational datasets, RANKOR achieved consistently lower median ranks than similarity- and distance-based approaches, while showing performance comparable to, and in some settings improved over, GSEA. The model generalized across unseen cell types and retained performance in single-cell settings, where it provided more consistent prioritization than existing approaches, such as ASGARD. RANKOR further enabled prioritization of transcriptionally unseen compounds through chemical-space embedding and achieved substantially reduced computation times. Robustness analyses demonstrated stable performance under moderate noise and degradation under extreme perturbation or gene shuffling. Gene attribution analyses indicated that prioritization decisions are driven by coherent and mechanism-relevant transcriptional programs.

Conclusions: RANKOR provides a scalable framework for transcriptomics-guided drug prioritization that can complement and extend existing approaches, such as GSEA. It can also support therapeutic hypothesis generation from bulk and single-cell data while leveraging the generalisability and computational efficiency of machine learning models.
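
For readers unfamiliar with signature reversal, the following is a minimal sketch of a similarity-based baseline of the kind the abstract compares against. It is not RANKOR and not the authors' code: the function names, the use of cosine similarity, and the toy four-gene signatures are all assumptions made for illustration. A drug whose signature is most anti-correlated with the disease signature is ranked first.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length expression signatures."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_reversal(disease_sig, drug_sigs):
    """Order drugs from most to least reversing (most negative cosine first).

    Hypothetical helper for illustration only; not part of RANKOR.
    """
    scores = {name: cosine(disease_sig, sig) for name, sig in drug_sigs.items()}
    ranking = sorted(scores, key=scores.get)
    return ranking, scores

# Toy data (invented): differential expression over four genes.
disease = [2.0, -1.0, 0.5, -3.0]
drugs = {
    "drug_a": [-2.0, 1.0, -0.5, 3.0],  # exactly reverses the disease signature
    "drug_b": [2.0, -1.0, 0.5, -3.0],  # mimics the disease signature
    "drug_c": [0.0, 1.0, 0.0, 0.0],    # weakly opposing
}

ranking, scores = rank_by_reversal(disease, drugs)
print(ranking)
```

In a benchmark of the kind described, the position of the known perturbing compound in such a ranking would be recorded for each query signature and summarized as a median rank, with lower values being better.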

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. Genome Medicine: 154 papers in training set, Top 0.2%, 14.6%
2. Bioinformatics: 1061 papers in training set, Top 2%, 14.2%
3. Nature Communications: 4913 papers in training set, Top 19%, 10.0%
4. Bioinformatics Advances: 184 papers in training set, Top 1%, 3.6%
5. Clinical Pharmacology & Therapeutics: 25 papers in training set, Top 0.2%, 3.6%
6. Journal of the American Medical Informatics Association: 61 papers in training set, Top 0.8%, 3.6%
7. PLOS Computational Biology: 1633 papers in training set, Top 11%, 3.2%

50% of probability mass above

8. Cell Reports Medicine: 140 papers in training set, Top 2%, 2.4%
9. PLOS ONE: 4510 papers in training set, Top 47%, 2.1%
10. Scientific Reports: 3102 papers in training set, Top 50%, 2.1%
11. npj Systems Biology and Applications: 99 papers in training set, Top 1.0%, 1.8%
12. BMC Bioinformatics: 383 papers in training set, Top 4%, 1.8%
13. Briefings in Bioinformatics: 326 papers in training set, Top 4%, 1.7%
14. Patterns: 70 papers in training set, Top 1%, 1.7%
15. Advanced Science: 249 papers in training set, Top 13%, 1.5%
16. Proceedings of the National Academy of Sciences: 2130 papers in training set, Top 35%, 1.5%
17. Nature Machine Intelligence: 61 papers in training set, Top 2%, 1.5%
18. NAR Genomics and Bioinformatics: 214 papers in training set, Top 2%, 1.3%
19. JCO Clinical Cancer Informatics: 18 papers in training set, Top 0.6%, 1.3%
20. Computational and Structural Biotechnology Journal: 216 papers in training set, Top 6%, 1.2%
21. The Lancet Digital Health: 25 papers in training set, Top 0.7%, 1.1%
22. Genome Biology: 555 papers in training set, Top 6%, 0.9%
23. Nature Biomedical Engineering: 42 papers in training set, Top 2%, 0.9%
24. iScience: 1063 papers in training set, Top 24%, 0.9%
25. Cell Systems: 167 papers in training set, Top 11%, 0.9%
26. Journal of Chemical Information and Modeling: 207 papers in training set, Top 3%, 0.9%
27. Clinical and Translational Science: 21 papers in training set, Top 1.0%, 0.8%
28. npj Digital Medicine: 97 papers in training set, Top 4%, 0.7%
29. GigaScience: 172 papers in training set, Top 3%, 0.7%
30. BMC Medical Genomics: 36 papers in training set, Top 1%, 0.7%
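
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked with a short cumulative sum over the listed percentages. This is a sketch only: the function name is hypothetical, and the inputs are the rounded values shown in the list, so the total is approximate.

```python
def journals_to_reach(probabilities, threshold=50.0):
    """Return (count, cumulative %) at the first rank where the running
    total of predicted probabilities reaches the threshold.

    Hypothetical helper; inputs are the rounded percentages from the list.
    """
    total = 0.0
    for count, p in enumerate(probabilities, start=1):
        total += p
        if total >= threshold:
            return count, total
    return None  # threshold never reached with the values supplied

# Predicted probabilities (%) of the ten highest-ranked journals, in order.
top_probs = [14.6, 14.2, 10.0, 3.6, 3.6, 3.6, 3.2, 2.4, 2.1, 2.1]

count, cumulative = journals_to_reach(top_probs)
print(count, round(cumulative, 1))
```

With these values the running total is about 49.6% after six journals and about 52.8% after seven, which matches the position of the 50% marker in the list.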