BioPipelines: Accessible Computational Protein and Ligand Design for Chemical Biologists
Quargnali, G.; Rivera-Fuentes, P.
Deep learning methods for protein structure generation, sequence design, and structure and property prediction have created unprecedented opportunities for protein engineering and drug discovery. However, using these tools often requires navigating incompatible software environments, diverse input/output formats, and high-performance computing infrastructure, any of which may hinder adoption by primarily experimental chemical biology laboratories. Here we present BioPipelines, an open-source Python framework that allows researchers to define multi-step computational design workflows in a few lines of code. Its robust yet modular architecture also makes it straightforward to extend the toolkit with new functionality, particularly with the help of coding agents. The framework currently integrates over 30 tools encompassing structure generation, sequence design, structure prediction, compound screening, and analysis. The same workflow code can be prototyped interactively in a Jupyter notebook and then submitted for production-scale runs without modification. We demonstrate applications in inverse folding, gene synthesis, de novo protein design, compound library screening, iterative binding site optimization, and fusion-protein linker optimization. We hope this framework will let researchers focus on the scientific question rather than on computational logistics. BioPipelines is available under the MIT license at https://github.com/locbp-uzh/biopipelines.