Teaching Diffusion Models Physics: Reinforcement Learning for Physically Valid Diffusion-Based Docking

Broster, J. H.; Popovic, B.; Kondinskaia, D.; Deane, C. M.; Imrie, F.

2026-03-27 bioinformatics
10.64898/2026.03.25.714128 bioRxiv

Molecular docking aims to predict the binding conformation of a small molecule to its protein target. Recent work has proposed diffusion models for this task, from rigid-body docking that diffuses over ligand degrees of freedom to co-folding approaches that jointly generate protein structure and ligand pose. However, diffusion-based docking models have been shown to frequently produce physically implausible poses and to fail to consistently recover key protein-ligand interactions. To address this, we introduce a reinforcement learning framework for training diffusion-based docking models directly on non-differentiable objectives. Fine-tuning DiffDock-Pocket for physical validity with our approach substantially increases the number of generated poses that are physically valid and interaction-preserving, with no increase in inference-time compute. Importantly, this comes without sacrificing structural accuracy; in fact, our approach increases the proportion of structures with near-native poses. These effects are most pronounced for protein targets that are dissimilar to the training data. Our fine-tuned DiffDock-Pocket model outperforms both classical docking algorithms and machine learning-based approaches on the PoseBusters set. Our results demonstrate that reinforcement learning can teach diffusion-based docking models to better respect physical constraints and recover key interactions, without relying on inference-time corrections.

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. Bioinformatics; 1061 papers in training set; Top 0.4%; 40.6%
2. Journal of Chemical Information and Modeling; 207 papers in training set; Top 1.0%; 5.0%
3. PLOS Computational Biology; 1633 papers in training set; Top 7%; 5.0%
(50% of probability mass above)
4. Nature Communications; 4913 papers in training set; Top 35%; 4.3%
5. Proteins: Structure, Function, and Bioinformatics; 82 papers in training set; Top 0.1%; 4.1%
6. Journal of Cheminformatics; 25 papers in training set; Top 0.1%; 3.8%
7. Cell Systems; 167 papers in training set; Top 3%; 3.7%
8. Scientific Reports; 3102 papers in training set; Top 34%; 3.7%
9. Proceedings of the National Academy of Sciences; 2130 papers in training set; Top 26%; 2.5%
10. Briefings in Bioinformatics; 326 papers in training set; Top 3%; 2.1%
11. Bioinformatics Advances; 184 papers in training set; Top 3%; 1.8%
12. Journal of Chemical Theory and Computation; 126 papers in training set; Top 0.6%; 1.5%
13. Frontiers in Molecular Biosciences; 100 papers in training set; Top 2%; 1.4%
14. Nature Machine Intelligence; 61 papers in training set; Top 2%; 1.3%
15. BMC Bioinformatics; 383 papers in training set; Top 5%; 1.3%
16. Protein Science; 221 papers in training set; Top 1%; 0.9%
17. Structure; 175 papers in training set; Top 3%; 0.9%
18. The Journal of Physical Chemistry B; 158 papers in training set; Top 2%; 0.8%
19. Nucleic Acids Research; 1128 papers in training set; Top 16%; 0.8%
20. Communications Biology; 886 papers in training set; Top 22%; 0.8%
21. Protein Engineering, Design and Selection; 14 papers in training set; Top 0.1%; 0.8%
22. Chemical Science; 71 papers in training set; Top 2%; 0.8%
23. ACS Omega; 90 papers in training set; Top 5%; 0.7%
24. PRX Life; 34 papers in training set; Top 1%; 0.7%
25. Journal of Molecular Biology; 217 papers in training set; Top 5%; 0.5%
26. National Science Review; 22 papers in training set; Top 3%; 0.5%