
FLIP2: Expanding Protein Fitness Landscape Benchmarks for Real-World Machine Learning Applications

Didi, K.; Alamdari, S.; Lu, A. X.; Wittmann, B.; Johnston, K. E.; Amini, A. P.; Madani, A. K.; Czeneszew, M.; Dallago, C.; Yang, K. K.

2026-02-24 bioengineering
10.64898/2026.02.23.707496 bioRxiv

Machine learning methods that predict protein fitness from sequence remain sensitive to changes in data distributions, limiting generalization across common conditions encountered in protein engineering. As a result, protein engineers are left uncertain about the practical utility of ML tools. The FLIP benchmark established protocols for testing generalization under some domain shifts, but it was limited to measurements of thermostability, binding, and viral capsid viability. We introduce FLIP2, a protein fitness benchmark spanning seven new datasets, including enzymes, protein-protein interactions, and light-sensitive proteins, as well as splits that measure generalization relevant to real-world protein engineering campaigns. Evaluating a suite of benchmark models across these datasets and splits reveals that simpler models often match or outperform fine-tuned protein language models on FLIP2, challenging the utility of existing transfer learning techniques. Provenance for all datasets has been recorded, and we redistribute all data under CC-BY 4.0 to facilitate continued progress.

Matching journals

The top 5 journals together account for at least 50% of the predicted probability mass.

1. Cell Systems; 167 papers in training set; Top 0.2%; 22.9%
2. Nature Methods; 336 papers in training set; Top 0.9%; 10.3%
3. Nature Communications; 4913 papers in training set; Top 17%; 10.3%
4. Protein Engineering, Design and Selection; 14 papers in training set; Top 0.1%; 4.9%
5. Proceedings of the National Academy of Sciences; 2130 papers in training set; Top 13%; 4.9%
50% of probability mass above
6. Nature; 575 papers in training set; Top 6%; 4.4%
7. Nature Biotechnology; 147 papers in training set; Top 2%; 4.4%
8. Nature Medicine; 117 papers in training set; Top 0.6%; 4.0%
9. Nature Machine Intelligence; 61 papers in training set; Top 0.8%; 3.7%
10. Nucleic Acids Research; 1128 papers in training set; Top 7%; 2.8%
11. Protein Science; 221 papers in training set; Top 0.6%; 2.1%
12. Scientific Reports; 3102 papers in training set; Top 57%; 1.7%
13. PLOS Computational Biology; 1633 papers in training set; Top 16%; 1.7%
14. Science; 429 papers in training set; Top 14%; 1.7%
15. PLOS ONE; 4510 papers in training set; Top 56%; 1.5%
16. Briefings in Bioinformatics; 326 papers in training set; Top 5%; 1.4%
17. eLife; 5422 papers in training set; Top 48%; 1.2%
18. ACS Synthetic Biology; 256 papers in training set; Top 2%; 0.9%
19. Proteins: Structure, Function, and Bioinformatics; 82 papers in training set; Top 0.8%; 0.9%
20. Bioinformatics; 1061 papers in training set; Top 9%; 0.9%
21. Nature Genetics; 240 papers in training set; Top 8%; 0.7%
22. Genome Biology; 555 papers in training set; Top 8%; 0.7%
23. Cell Genomics; 162 papers in training set; Top 7%; 0.7%
24. Structure; 175 papers in training set; Top 4%; 0.5%
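
The 50% cutoff stated above the ranking can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch, not the site's own method: the function name `journals_to_reach` is hypothetical, and the values are the rounded percentages copied from the first eight entries of the ranking, so the sum is only as exact as the displayed figures.

```python
# Rounded predicted probabilities (percent) for ranks 1-8, copied from the ranking.
probabilities = [22.9, 10.3, 10.3, 4.9, 4.9, 4.4, 4.4, 4.0]


def journals_to_reach(mass_threshold, probs):
    """Return (count, cumulative_mass) for the shortest top-ranked prefix
    whose summed probability reaches mass_threshold.

    Returns (None, total) if the threshold is never reached.
    """
    total = 0.0
    for count, p in enumerate(probs, start=1):
        total += p
        if total >= mass_threshold:
            return count, total
    return None, total


count, mass = journals_to_reach(50.0, probabilities)
print(f"Top {count} journals cover {mass:.1f}% of the predicted probability mass")
```

With these rounded values the first four journals sum to 48.4%, so the fifth is the one that crosses the 50% mark.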