
Circumventing the synthesizability problem in generative molecular design

Weller, J. A.; Li, J.; Jiang, Y.; Rohs, R.

2026-02-19 bioinformatics
10.64898/2026.02.18.706722 bioRxiv

Generative structure-based drug design (SBDD) models have shown great promise for accelerating the discovery of novel drug candidates. However, these models have been criticized for producing compounds that are difficult to synthesize and are therefore of limited practical use in drug design. In this work, we propose a way to circumvent the synthesizability issue by introducing a model-guided virtual screening (MGVS) pipeline, which pairs SBDD models with efficient chemical similarity search methods to identify synthesizable analogs of generated compounds in existing ultra-large compound databases. Using this approach, we demonstrate that synthesizable analogs of generated compounds, with equivalent or better docking scores and similar predicted binding poses, can be reliably identified across a wide range of protein targets. We find that MGVS outperforms standard virtual ligand screening (VLS), consistently yielding at least a 25x improvement in screening efficiency across three different SBDD models. As drug-like chemical spaces continue to grow and standard VLS methods based on exhaustive screening become increasingly impractical, approaches like MGVS that effectively narrow the search space will become critical for advancing drug discovery.
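
To make the similarity-search step more concrete, the following is a minimal sketch of how a generated compound might be matched against a library of purchasable compounds. The abstract does not state which fingerprints or similarity metric the authors use, so Tanimoto similarity over bit-set fingerprints is an assumption here, and all names (`tanimoto`, `nearest_analogs`, `cmpd_A`, and so on) are hypothetical. Fingerprints are represented as sets of on-bit indices; in practice they would come from a cheminformatics toolkit, and an ultra-large database would need an indexed search rather than this linear scan.

```python
from typing import Dict, FrozenSet, List, Tuple

# A fingerprint is modeled as the set of indices of its "on" bits (assumption).
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity: |A & B| / |A | B|."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def nearest_analogs(
    query: Fingerprint,
    library: Dict[str, Fingerprint],
    k: int = 2,
    min_similarity: float = 0.0,
) -> List[Tuple[str, float]]:
    """Return the k library compounds most similar to the query."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    scored = [(name, s) for name, s in scored if s >= min_similarity]
    # Highest similarity first; ties broken by name for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]


# Toy data: one "generated" compound and a three-compound "database".
generated = frozenset({1, 4, 7, 9})
library = {
    "cmpd_A": frozenset({1, 4, 7, 9, 12}),
    "cmpd_B": frozenset({1, 4}),
    "cmpd_C": frozenset({20, 21}),
}

hits = nearest_analogs(generated, library, k=2)
for name, score in hits:
    print(f"{name}: {score:.2f}")
```

In a pipeline of the kind the abstract describes, the retrieved analogs would then be docked and compared with the generated compound; that step is outside the scope of this sketch.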

Matching journals

The top 1 journal accounts for 50% of the predicted probability mass.