Scalable prediction of symmetric protein complex structures

Yu, V. S.; Demsko, P.; Castells-Graells, R.; Parker, H.; Huang, A.; Chen, C.; Huang, M.; Srinivasan, V.; Ajjarapu, K.; Tofighbakhsh, N.; Yu, R.; Lake, M.; Glanzman, D.; Warren, S.; Alzagatiti, J.

2026-02-05 bioengineering

10.1101/2025.11.14.688531 bioRxiv

All life relies on proteins to function, yet accurately modeling protein structures that exceed approximately 10,000 amino acids or have higher-order geometries remains difficult. Existing solutions are limited to specific scenarios, require considerable computational resources, or are otherwise unscalable. Consequently, many large, disease-relevant protein complexes in the human proteome, as well as nearly all viruses and numerous other classes, are impractical to model with high fidelity for drug development. To modulate these protein complexes and viruses, structural information is eminently valuable, and often essential. In the last two years, machine learning-based tools that can generate binders to a given target structure with high hit rates have emerged. Combined with high-throughput screening, these technologies can far outpace traditional drug discovery. However, they cannot function well without accurate models of their target structures. Thus, to unlock the full power of AI-driven drug discovery, a scalable method must be developed to predict large protein complex structures. To overcome this bottleneck, we introduce Plica-1, a physics-based method to rapidly and accurately predict the structure of arbitrarily large, symmetric protein complexes. Validated across 4 major symmetry classes (icosahedral, tetrahedral, octahedral, and cyclic), the method consistently achieves near-experimental levels of accuracy, i.e., RMSD < 5 Å. In test cases, the method runs in < 5 minutes on consumer hardware, 10³ to 10⁵ times faster than the closest comparable software. The largest structure currently built, at approximately 40,000 amino acids, is > 8 times the limit of existing machine learning methods. The results demonstrate that protein complexes can be modeled at significantly improved speeds and scales, making Plica-1 a promising tool for protein engineering and drug development.
