Unified sampling framework and experimental benchmarking of sequence- and structure-based protein models

Spinner, A.; Notin, P.; Berry, S.; Cortade, D.; Sisson, Z.; Ikonomova, S.; Ross, D.; Marks, D.

2026-05-12 bioinformatics
10.64898/2026.05.08.723784 bioRxiv

Generative models are increasingly used for protein design, but the lack of standardized evaluation frameworks limits comparison across model classes and hinders translation to experimental success. Here, we introduce a unified sampling and benchmarking framework that enables controlled sequence generation across alignment-based, protein language, and structure-based models, and apply it to Tobacco etch virus (TEV) protease. Across hundreds of thousands of designed sequences, different models explore distinct regions of sequence space, and no computational selection metric clearly assesses enzymatic function. Experimental evaluation reveals large differences in functional outcomes, ranging from non-functional variants to sequences with 9-fold higher activity than wildtype. Machine learning-designed libraries achieve a 39.32% hit rate (percentage of variants matching or exceeding wildtype activity) compared with 6.06% for an error-prone PCR baseline. Structure-based models perform best overall, with hit rates of 74.4% and 66.8% for ESM-IF1 and ProteinMPNN, respectively. Commonly used selection metrics do not strongly correlate with experimental activity, highlighting a gap between in silico evaluation and enzyme function. Together, these results establish a generalizable framework for benchmarking generative protein models and demonstrate the necessity of experimental validation for guiding model development and sequence prioritization.
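
The hit rate defined in the abstract (percentage of variants matching or exceeding wildtype activity) can be illustrated with a minimal Python sketch. The function name `hit_rate` and the activity values are hypothetical and made up for illustration; they are not data from the paper, and how the authors measure or normalize activity is not stated here.

```python
def hit_rate(activities, wildtype_activity):
    """Percentage of variants whose activity matches or exceeds wildtype."""
    if not activities:
        raise ValueError("need at least one measured variant")
    hits = sum(1 for a in activities if a >= wildtype_activity)
    return 100.0 * hits / len(activities)


# Made-up relative activities, assuming wildtype is normalized to 1.0.
example_activities = [0.0, 0.4, 1.0, 1.3, 9.0]
print(hit_rate(example_activities, wildtype_activity=1.0))  # 60.0
```

Under this definition a variant with exactly wildtype activity counts as a hit, which is why the comparison uses `>=` rather than `>`.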

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

Rank. Journal (papers in training set, percentile): predicted probability

1. Nature Communications (4913 papers in training set, Top 14%): 12.4%
2. Cell Systems (167 papers in training set, Top 1%): 10.0%
3. Journal of Chemical Information and Modeling (207 papers in training set, Top 0.6%): 10.0%
4. Nature Biotechnology (147 papers in training set, Top 1%): 6.3%
5. Nucleic Acids Research (1128 papers in training set, Top 4%): 4.8%
6. Computational and Structural Biotechnology Journal (216 papers in training set, Top 0.9%): 4.8%
7. Nature Machine Intelligence (61 papers in training set, Top 1.0%): 3.6%
50% of probability mass above
8. PLOS Computational Biology (1633 papers in training set, Top 10%): 3.6%
9. Briefings in Bioinformatics (326 papers in training set, Top 2%): 2.7%
10. mAbs (28 papers in training set, Top 0.1%): 2.1%
11. Chemical Science (71 papers in training set, Top 0.8%): 1.9%
12. Advanced Science (249 papers in training set, Top 10%): 1.9%
13. Cell Reports Methods (141 papers in training set, Top 2%): 1.8%
14. Proceedings of the National Academy of Sciences (2130 papers in training set, Top 31%): 1.8%
15. Nature Methods (336 papers in training set, Top 4%): 1.8%
16. Protein Science (221 papers in training set, Top 0.9%): 1.7%
17. Scientific Reports (3102 papers in training set, Top 58%): 1.7%
18. Journal of Cheminformatics (25 papers in training set, Top 0.3%): 1.7%
19. Bioinformatics (1061 papers in training set, Top 7%): 1.7%
20. ACS Synthetic Biology (256 papers in training set, Top 2%): 1.5%
21. Communications Biology (886 papers in training set, Top 11%): 1.5%
22. Journal of Chemical Theory and Computation (126 papers in training set, Top 0.7%): 1.2%
23. Journal of Molecular Biology (217 papers in training set, Top 2%): 1.2%
24. PLOS ONE (4510 papers in training set, Top 60%): 1.2%
25. Proteins: Structure, Function, and Bioinformatics (82 papers in training set, Top 0.7%): 0.9%
26. Science (429 papers in training set, Top 19%): 0.8%
27. NAR Genomics and Bioinformatics (214 papers in training set, Top 3%): 0.8%
28. Bioinformatics Advances (184 papers in training set, Top 5%): 0.7%
29. International Journal of Molecular Sciences (453 papers in training set, Top 16%): 0.7%
30. Genome Medicine (154 papers in training set, Top 9%): 0.6%
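
The 50% cutoff stated for the top 7 journals can be checked by accumulating the listed probabilities in rank order. A minimal Python sketch follows; the function name `entries_to_reach` is hypothetical, and the inputs are the rounded percentages shown in the list, so the sum is approximate.

```python
def entries_to_reach(probabilities, threshold):
    """Return how many top-ranked entries are needed for the running sum
    to reach the threshold, or None if it is never reached."""
    total = 0.0
    for count, p in enumerate(probabilities, start=1):
        total += p
        if total >= threshold:
            return count
    return None


# Predicted probabilities (in percent) for ranks 1 to 8, as listed.
top_probabilities = [12.4, 10.0, 10.0, 6.3, 4.8, 4.8, 3.6, 3.6]
print(entries_to_reach(top_probabilities, 50.0))  # 7
```

The first six entries sum to about 48.3%, so the seventh is the one that crosses the 50% mark.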