Assessment of Protein Complex Predictions in CASP16: Are we making progress?

Zhang, J.; Yuan, R.; Kryshtafovych, A.; Kretsch, R. C.; Schaeffer, R. D.; Zhou, J.; Das, R.; Grishin, N. V.; Cong, Q.

2025-05-30 biophysics
10.1101/2025.05.29.656875 bioRxiv
The assessment of oligomer targets in the Critical Assessment of Structure Prediction Round 16 (CASP16) suggests that complex structure prediction remains an unsolved challenge. More than 30% of targets, particularly antibody-antigen targets, were highly challenging, with each group correctly predicting structures for only about a quarter of such targets. Most CASP16 groups relied on AlphaFold-Multimer (AFM) or AlphaFold3 (AF3) as their core modeling engines. By optimizing input MSAs, refining modeling constructs (using partial rather than full sequences), and employing massive model sampling and selection, top-performing groups were able to significantly outperform the default AFM/AF3 predictions. CASP16 also introduced two additional challenges: Phase 0, which required predictions without stoichiometry information, and Phase 2, which provided participants with thousands of models generated by MassiveFold (MF) to enable large-scale sampling for resource-limited groups. Across all phases, the MULTICOM series and Kiharalab emerged as top performers based on the quality of their best models per target. However, these groups did not have a strong advantage in model ranking, and thus their lead over other teams, such as Yang-Multimer and kozakovvajda, was less pronounced when evaluating only the first submitted models. Compared to CASP15, CASP16 showed moderate overall improvement, likely driven by the release of AF3 and the extensive model sampling employed by top groups. Several notable trends highlight key frontiers for future development. First, the kozakovvajda group significantly outperformed others on antibody-antigen targets, achieving over a 60% success rate without relying on AFM or AF3 as their primary modeling framework, suggesting that alternative approaches may offer promising solutions for these difficult targets. Second, model ranking and selection continue to be major bottlenecks. 
The PEZYFoldings group demonstrated a notable advantage in selecting their best models as first models, suggesting that their pipeline for model ranking may offer important insights for the field. Finally, the Phase 0 experiment indicated reasonable success in stoichiometry prediction; however, stoichiometry prediction remains challenging for high-order assemblies and targets that differ from available homologous templates. Overall, CASP16 demonstrated steady progress in multimer prediction while emphasizing the urgent need for more effective model ranking strategies, improved stoichiometry prediction, and the development of new modeling methods that extend beyond the current AF-based paradigm.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Structure: 175 papers in training set; Top 0.1%; 17.8%
2. Proteins: Structure, Function, and Bioinformatics: 82 papers in training set; Top 0.1%; 16.7%
3. Protein Science: 221 papers in training set; Top 0.1%; 11.8%
4. Acta Crystallographica Section D Structural Biology: 54 papers in training set; Top 0.1%; 4.1%

50% of probability mass above

5. PLOS Computational Biology: 1633 papers in training set; Top 9%; 4.0%
6. Bioinformatics Advances: 184 papers in training set; Top 1%; 3.8%
7. Journal of Molecular Biology: 217 papers in training set; Top 0.7%; 3.4%
8. eLife: 5422 papers in training set; Top 28%; 3.4%
9. Nature Communications: 4913 papers in training set; Top 41%; 3.4%
10. Journal of Structural Biology: 58 papers in training set; Top 0.5%; 2.3%
11. Nucleic Acids Research: 1128 papers in training set; Top 10%; 1.8%
12. IUCrJ: 29 papers in training set; Top 0.2%; 1.8%
13. iScience: 1063 papers in training set; Top 17%; 1.6%
14. Communications Biology: 886 papers in training set; Top 10%; 1.6%
15. Journal of Chemical Information and Modeling: 207 papers in training set; Top 2%; 1.6%
16. Nature Methods: 336 papers in training set; Top 5%; 1.4%
17. Frontiers in Molecular Biosciences: 100 papers in training set; Top 3%; 1.3%
18. PLOS ONE: 4510 papers in training set; Top 59%; 1.3%
19. Cell Systems: 167 papers in training set; Top 10%; 1.1%
20. Scientific Reports: 3102 papers in training set; Top 74%; 0.8%
21. Computational and Structural Biotechnology Journal: 216 papers in training set; Top 10%; 0.7%
22. Life Science Alliance: 263 papers in training set; Top 2%; 0.7%
23. Biomolecules: 95 papers in training set; Top 3%; 0.7%
24. Briefings in Bioinformatics: 326 papers in training set; Top 7%; 0.7%
25. Chemical Science: 71 papers in training set; Top 2%; 0.7%
26. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 48%; 0.6%
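
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked against the listed percentages. The sketch below is only an illustration of that arithmetic: the function name `journals_to_reach` and the variable `top_probabilities` are hypothetical, and the values are the first six percentages copied from the ranking above. It does not reflect how the page itself computes the cutoff.

```python
from itertools import accumulate

# Predicted probabilities (in percent) for the six highest-ranked journals,
# copied from the ranking above. The names used here are illustrative only.
top_probabilities = [17.8, 16.7, 11.8, 4.1, 4.0, 3.8]


def journals_to_reach(probabilities, threshold):
    """Return how many leading entries are needed for the running total
    to reach `threshold`, or None if the listed entries never reach it."""
    for count, running_total in enumerate(accumulate(probabilities), start=1):
        if running_total >= threshold:
            return count
    return None


print(journals_to_reach(top_probabilities, 50.0))  # prints 4
```

The first three entries sum to about 46.3%, which is below the threshold; adding the fourth (4.1%) brings the running total to about 50.4%, consistent with the cutoff shown after rank 4.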