Deep learning models for chemical perturbation prediction do not yet utilise drug molecular features
Bai, J.; Prince, S.; Nitschke, G. S.
2026-05-15
bioinformatics
10.64898/2026.05.13.724458
bioRxiv
Recent deep learning models for L1000 chemical perturbation prediction incorporate dedicated drug molecular encoders. We retrained seven such models from scratch with zeroed or shuffled drug inputs, and compared them with a multilayer perceptron that uses only cell-line basal expression. Under drug-blind evaluation, ablation caused negligible performance changes and the drug-free baseline matched all models. Current architectures do not yet utilise drug molecular features for generalisation to unseen compounds.
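The ablation protocol described in the abstract can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' code: the function names are hypothetical, "shuffled" is assumed here to mean permuting whole drug feature vectors across samples, and "drug-blind" is assumed to mean that no compound appears in both the training and test sets.

```python
import random
from typing import Hashable, List, Sequence, Tuple


def zero_drug_features(drug_features: Sequence[Sequence[float]]) -> List[List[float]]:
    """Replace every drug feature vector with zeros of the same length."""
    return [[0.0] * len(row) for row in drug_features]


def shuffle_drug_features(
    drug_features: Sequence[Sequence[float]], seed: int = 0
) -> List[List[float]]:
    """Permute whole feature vectors across samples (assumed shuffling scheme).

    Each vector stays intact but is detached from the sample it belonged to.
    """
    rng = random.Random(seed)
    order = list(range(len(drug_features)))
    rng.shuffle(order)
    return [list(drug_features[i]) for i in order]


def drug_blind_split(
    compound_ids: Sequence[Hashable], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Split sample indices so that test compounds never occur in training."""
    rng = random.Random(seed)
    unique = sorted(set(compound_ids))
    rng.shuffle(unique)
    n_test = max(1, int(round(len(unique) * test_fraction)))
    test_compounds = set(unique[:n_test])
    train_idx = [i for i, c in enumerate(compound_ids) if c not in test_compounds]
    test_idx = [i for i, c in enumerate(compound_ids) if c in test_compounds]
    return train_idx, test_idx
```

Under this reading, a model retrained on the output of `zero_drug_features` or `shuffle_drug_features` has no usable compound information, so matching the unablated model on the held-out compounds would indicate that the drug encoder contributes little to generalisation.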
Matching journals
Publisher type: ● Non-profit, ◐ University press, ○ Commercial
The top 6 journals account for 50% of the predicted probability mass.
Rank | Journal | Publisher type | Papers in training set | Top % | Predicted probability
1 | Nature Communications | ○ | 4913 | Top 13% | 12.9%
2 | Nature Machine Intelligence | ○ | 61 | Top 0.1% | 10.8%
3 | Bioinformatics | ◐ | 1061 | Top 3% | 7.4%
4 | Journal of Chemical Information and Modeling | ● | 207 | Top 0.7% | 7.4%
5 | Scientific Reports | ○ | 3102 | Top 13% | 7.0%
6 | Journal of Cheminformatics | ○ | 25 | Top 0.1% | 6.6%
(50% of probability mass above)
7 | Briefings in Bioinformatics | ◐ | 326 | Top 1% | 4.5%
8 | PLOS ONE | ● | 4510 | Top 45% | 2.4%
9 | Artificial Intelligence in the Life Sciences | ○ | 11 | Top 0.1% | 2.1%
10 | PLOS Computational Biology | ● | 1633 | Top 13% | 2.1%
11 | eLife | ● | 5422 | Top 37% | 1.9%
12 | BMC Bioinformatics | ○ | 383 | Top 4% | 1.9%
13 | npj Systems Biology and Applications | ○ | 99 | Top 1% | 1.7%
14 | Proceedings of the National Academy of Sciences | ● | 2130 | Top 31% | 1.7%
15 | Metabolites | ○ | 50 | Top 0.7% | 1.3%
16 | Nature Methods | ○ | 336 | Top 5% | 1.0%
17 | Bioinformatics Advances | ◐ | 184 | Top 4% | 1.0%
18 | Communications Biology | ○ | 886 | Top 18% | 0.9%
19 | Chemical Science | ● | 71 | Top 2% | 0.8%
20 | Frontiers in Molecular Biosciences | ○ | 100 | Top 4% | 0.8%
21 | Computational and Structural Biotechnology Journal | ● | 216 | Top 9% | 0.7%
22 | Patterns | ○ | 70 | Top 3% | 0.7%
23 | Genome Medicine | ○ | 154 | Top 8% | 0.7%
24 | Molecules | ○ | 37 | Top 2% | 0.7%
25 | Cancers | ○ | 200 | Top 5% | 0.7%
26 | Frontiers in Pharmacology | ○ | 100 | Top 5% | 0.7%
27 | Clinical Pharmacology & Therapeutics | ○ | 25 | Top 0.9% | 0.7%
28 | iScience | ○ | 1063 | Top 39% | 0.5%
29 | BioData Mining | ○ | 15 | Top 1% | 0.5%
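The statement that the top 6 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order. The sketch below copies the first ten percentages from the listing above; the helper name `journals_needed` is hypothetical and not part of the page.

```python
from itertools import accumulate
from typing import Optional, Sequence

# Predicted probabilities (in percent) for ranks 1 to 10, copied from the listing.
probabilities = [12.9, 10.8, 7.4, 7.4, 7.0, 6.6, 4.5, 2.4, 2.1, 2.1]


def journals_needed(probs: Sequence[float], threshold: float = 50.0) -> Optional[int]:
    """Return the smallest k whose top-k probabilities sum to at least threshold."""
    for k, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return k
    return None  # threshold never reached with the values supplied


print(journals_needed(probabilities))  # 6
```

The first five entries sum to 45.5% and the first six to 52.1%, so the cut-off falls after rank 6, consistent with the marker in the listing.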