Predicting Pre-treatment Resistance or Post-treatment Effect? A Systematic Benchmarking of Single-Cell Drug Response Models

Shen, L.; Sun, X.; Zheng, S.; Hashmi, A.; Eriksson, J.; Mustonen, H.; Seppänen, H.; Shen, B.; Li, M.; Vähä-Koskela, M.; Tang, J.

2026-04-14 bioinformatics
10.64898/2026.04.10.717709 bioRxiv
Intratumoral heterogeneity is a major driver of variable drug responses in cancer. Single-cell RNA sequencing (scRNA-seq) enables the characterization of such heterogeneity, providing an opportunity to predict drug response at single-cell resolution. As a result, a growing number of computational models have been developed to infer drug response from scRNA-seq datasets. However, their performance, robustness, and generalizability across different biological contexts have not been systematically evaluated. To address this gap, we conducted a comprehensive benchmarking of representative single-cell drug response prediction models. Using 26 curated datasets comprising over 760,000 cells across 12 cancer types and 21 therapeutic agents, we constructed balanced and imbalanced scenarios to reflect more realistic distributions of drug response labels. To address the lack of ground-truth drug-response labels in conventional scRNA-seq datasets, we further incorporated lineage-tracing data with experimentally validated drug-response annotations, enabling model evaluation in a clinically relevant pre-treatment prediction setting. Our results show that across the tested methods, the prediction performance is markedly higher in cell lines than in tissue samples. Under imbalanced conditions, most methods exhibited sharp performance declines, whereas scDEAL demonstrated the highest robustness. Independent validation using an in-house pancreatic ductal adenocarcinoma dataset further confirms the robustness of scDEAL and its ability to capture biologically meaningful state transitions. A label-substitution experiment revealed that this robust performance is partially driven by the model's specific training label construction. However, the benchmarking with lineage-tracing data reveals a fundamental limitation: most models capture drug-induced transcriptional changes but struggle to predict a cell's intrinsic resistance state prior to treatment.
In summary, our study not only defines the performance boundaries of current approaches but also highlights their limitations in addressing intratumoral heterogeneity, extreme class imbalance, and the prediction of intrinsic cellular resistance, emphasizing the need for the development of next-generation single-cell drug response models with stronger clinical relevance.

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. Briefings in Bioinformatics | 326 papers in training set | Top 0.1% | 17.2%
2. PLOS Computational Biology | 1633 papers in training set | Top 4% | 9.0%
3. Genome Medicine | 154 papers in training set | Top 0.7% | 8.2%
4. Nucleic Acids Research | 1128 papers in training set | Top 3% | 6.2%
5. npj Systems Biology and Applications | 99 papers in training set | Top 0.3% | 4.8%
6. Scientific Reports | 3102 papers in training set | Top 29% | 4.2%
7. Nature Communications | 4913 papers in training set | Top 36% | 4.2%
50% of probability mass above
8. NAR Genomics and Bioinformatics | 214 papers in training set | Top 0.8% | 3.5%
9. Nature Machine Intelligence | 61 papers in training set | Top 1% | 2.5%
10. Bioinformatics | 1061 papers in training set | Top 6% | 2.5%
11. Cancer Research Communications | 46 papers in training set | Top 0.3% | 2.1%
12. Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 3% | 2.0%
13. BMC Bioinformatics | 383 papers in training set | Top 5% | 1.7%
14. iScience | 1063 papers in training set | Top 16% | 1.7%
15. Advanced Science | 249 papers in training set | Top 12% | 1.6%
16. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 5% | 1.6%
17. Cell Systems | 167 papers in training set | Top 8% | 1.5%
18. Communications Biology | 886 papers in training set | Top 13% | 1.3%
19. PLOS ONE | 4510 papers in training set | Top 61% | 1.2%
20. International Journal of Molecular Sciences | 453 papers in training set | Top 11% | 1.2%
21. Frontiers in Molecular Biosciences | 100 papers in training set | Top 3% | 1.1%
22. Cell Reports Medicine | 140 papers in training set | Top 6% | 0.9%
23. Science Advances | 1098 papers in training set | Top 27% | 0.9%
24. eLife | 5422 papers in training set | Top 59% | 0.7%
25. Bioinformatics Advances | 184 papers in training set | Top 5% | 0.7%
26. Computers in Biology and Medicine | 120 papers in training set | Top 5% | 0.7%
27. Cell Genomics | 162 papers in training set | Top 7% | 0.7%
28. Genome Biology | 555 papers in training set | Top 8% | 0.7%
29. Nature Biomedical Engineering | 42 papers in training set | Top 3% | 0.6%
30. Frontiers in Bioinformatics | 45 papers in training set | Top 1% | 0.6%
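
The 50% cutoff stated under "Matching journals" can be checked from the listed percentages. The following is a minimal sketch, assuming the final percentage on each entry is that journal's predicted probability; the function name `journals_to_reach` is hypothetical and is not part of the site.

```python
from itertools import accumulate

# Predicted probabilities (%) for the ten highest-ranked journals,
# copied from the list above (assumed to be per-journal probabilities).
probs = [17.2, 9.0, 8.2, 6.2, 4.8, 4.2, 4.2, 3.5, 2.5, 2.5]


def journals_to_reach(probabilities, threshold=50.0):
    """Return (count, cumulative mass) for the shortest prefix reaching threshold.

    Returns None if the probabilities never add up to the threshold.
    """
    for count, mass in enumerate(accumulate(probabilities), start=1):
        if mass >= threshold:
            return count, mass
    return None


k, mass = journals_to_reach(probs)
print(k, round(mass, 1))  # prints "7 53.8"
```

The first six journals sum to 49.6%, just under the threshold, so the seventh is needed to cross 50%.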