Benchmarking and Experimental Validation of Machine Learning Strategies for Enzyme Engineering
Zeng, Z.; Jin, J.; Xu, R.; Luo, X.
Enzyme-directed evolution increasingly relies on computational tools to prioritize mutations, yet their practical value is difficult to assess because kinetic data are often aggregated across heterogeneous assay conditions, inflating apparent generalization. Here we introduce EnzyArena, a curated benchmark that groups kinetic parameters (kcat, Km, kcat/Km) into condition-matched experimental subsets to enable realistic evaluation. Using this resource, we benchmark 10 representative models from two emerging strategy families, zero-shot fitness prediction and supervised kinetic-parameter prediction, across BRENDA- and SABIO-RK-derived subsets and 25 independent mutagenesis datasets. Kinetic-parameter predictors perform strongly on database-derived subsets but lose their advantage on independent datasets, whereas zero-shot predictors generalize more consistently. A simple consensus of multiple zero-shot models further improves the precision of identifying beneficial mutants. We prospectively validated these findings in a wet-lab campaign (150 mutants) comparing random mutants, UniKP-prioritized mutants, and ESM-1v-prioritized mutants (representing supervised kinetic-parameter prediction and zero-shot fitness prediction, respectively), in which ESM-1v achieved the highest utility and UniKP underperformed the random baseline. Together, this study establishes realistic baselines for computational mutant prioritization and highlights consensus zero-shot strategies as a practical starting point for enzyme engineering.
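The consensus strategy described in the abstract can be sketched as a rank-averaging scheme: each zero-shot model scores every candidate mutant, mutants are ranked per model, and candidates with the best mean rank are prioritized. This is a minimal illustrative sketch; the model names, scores, and the specific aggregation rule are assumptions, not details taken from the paper.

```python
# Hypothetical consensus of zero-shot scores via rank averaging.
# Higher score = predicted more beneficial; lower mean rank = higher priority.
def rank_average(scores_by_model):
    """scores_by_model: {model_name: {mutant: score}}.
    Returns mutants sorted by mean rank across models (best first)."""
    mutants = list(next(iter(scores_by_model.values())))
    mean_rank = {}
    for m in mutants:
        ranks = []
        for scores in scores_by_model.values():
            ordered = sorted(scores, key=scores.get, reverse=True)
            ranks.append(ordered.index(m) + 1)  # 1 = top-ranked by this model
        mean_rank[m] = sum(ranks) / len(ranks)
    return sorted(mean_rank, key=mean_rank.get)

# Illustrative scores for three mutants from two hypothetical zero-shot models.
scores = {
    "esm1v": {"A10V": 1.2, "G45D": -0.3, "L77F": 0.8},
    "other_zero_shot": {"A10V": 1.3, "G45D": 0.1, "L77F": 1.1},
}
print(rank_average(scores))  # A10V has the best mean rank here
```

In practice, rank averaging sidesteps the problem that different zero-shot models produce scores on incomparable scales, which is one plausible reason a simple consensus can improve precision over any single model.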