Effects of protein interface mutations on protein quality and affinity

de Kanter, J. K.; Smorodina, E.; Minnegalieva, A.; Arts, M.; Blaabjerg, L. M.; Frolenkova, M.; Rawat, P.; Wolfram, L.; Britze, H.; Wilke, Y.; Weissenborn, L.; Lindenburg, L.; Engelhart, E.; McGowan, K. L.; Emerson, R.; Lopez, R.; van Bemmel, J. G.; Demharter, S.; Spreafico, R.; Greiff, V.

2026-03-26 molecular biology
10.64898/2026.03.24.713863 bioRxiv

Accurately modeling antibody-antigen interactions requires distinguishing intrinsic binding affinity ("protein-interaction") from protein biophysical properties ("protein-quality"), including folding, stability, and expression. However, high-throughput mutational measurements commonly used to train and benchmark computational models often conflate these effects, obscuring the true determinants of molecular recognition. Here, we present an experimental and analytical framework to disentangle protein-interaction effects from protein-quality effects in single-domain antibody (VHH)-antigen binding. Using a large-scale deep mutational scanning (DMS) dataset spanning four VHH-antigen complexes, with single and double mutations in both partners, we introduce control binders to quantify protein-quality changes independently of protein-interaction. This enables decomposition of experimentally measured affinity into protein-interaction and protein-quality components at scale. Leveraging the disentangled dataset, we evaluated state-of-the-art structure- and sequence-based models for protein-quality and protein-interaction prediction and show that their performance largely reflects protein-quality rather than protein-interaction effects. Our results highlight a major confounder in current datasets and suggest that accounting for protein-quality will be essential for training next-generation affinity-prediction models.

Nomenclature

Antibody-related terms
- Primary VHH: The VHH of a VHH-antigen complex for which the paratope and the epitope were mutated.
- Control VHH: A second VHH that binds to the same antigen as the primary VHH but has non-overlapping epitope positions and therefore does not bind to any of the mutated antigen positions.

Affinity-related terms
- Real affinity: "The strength of the interaction between two [...] molecules that bind reversibly (interact)" [1]. In the context of antibody-antigen binding, it quantifies interactions between active proteins (which are expressed and correctly folded [2] and are therefore functionally and biologically active; see below). It is commonly quantified by the equilibrium dissociation constant, KD.
- Observed affinity (°KD): The interaction strength experimentally measured between two molecules. Unlike real affinity, this value is confounded by the biophysical properties of the individual binding partners, specifically their folding, stability, and expression levels. Consequently, the observed affinity often differs from the real/intrinsic affinity if a significant fraction of the protein population is inactive [3]. NOTE: Unless otherwise specified, °KD is reported in -log10 space. For example, a °KD of -9 corresponds to 10^-9 M, or 1 nM.
- Change in observed affinity (Δ°KD): The shift in the observed affinity between two proteins upon mutation, reported as the log10-transformed fold change. A value of 1 reflects a 10-fold difference, a value of 2 a 100-fold difference, etc. This aggregate change resolves into two distinct biophysical components [2, 4]:
  - Protein-interaction change: The change in the intrinsic thermodynamic affinity between the two binding partners, each in its active state (i.e., the change in interface Gibbs free energy, which accounts for both enthalpy and entropy).
  - Protein-quality change: The change in the fraction of the mutated protein population that is biologically active, meaning it is expressed, correctly folded, and stable [2, 5].
    - Folding: The process that guides the polypeptide chain toward its native conformation, which is a prerequisite for forming a functional binding site.
    - Stability: The thermodynamic capacity to maintain the folded structure over time and under physiological conditions. Stability (decrease in Gibbs free energy from the unfolded to the folded state) ensures the binding interface remains intact and prevents competing processes such as aggregation [6].
    - Expression: The steady-state abundance of the protein. This is largely dependent on proper folding and stability, as cellular quality control mechanisms degrade proteins that fail to fold or remain stable at functional concentrations.
- Change in relative affinity (ΔΔ°KD): The difference between the Δ°KD of the primary VHH and the Δ°KD of the control VHH for a given epitope mutation.

Model-related terms
- ESM-IF1 sc: Single-chain (sc) structure-conditioned inverse folding model (ESM-IF1), using the isolated monomer structure of the mutated protein: either the VHH or the antigen [7].
- ESM-IF1 mc: Multi-chain (mc) structure-conditioned model (ESM-IF1), using the full complex structure (both antibody and antigen) [7].
- Stability prediction score: Score that represents the predicted change in stability based on a single mutation, normally represented as ΔΔG.
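The affinity-related definitions above can be illustrated with a small worked example. The sketch below is not the authors' pipeline; it only shows the arithmetic implied by the nomenclature. It makes three assumptions that are not stated in the abstract: the observed affinity is stored as log10 of the molar KD (matching the worked example, where -9 corresponds to 10^-9 M), the change upon mutation is taken as mutant minus wild type, and all numeric values are invented for illustration. The class and function names (`EpitopeMutation`, `delta_observed`, `relative_affinity_change`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EpitopeMutation:
    """Observed affinities for one antigen (epitope) mutation.

    All values are log10 of KD in molar units (assumption: -9 means 1 nM).
    """
    name: str
    primary_wt: float   # primary VHH vs. wild-type antigen
    primary_mut: float  # primary VHH vs. mutated antigen
    control_wt: float   # control VHH vs. wild-type antigen
    control_mut: float  # control VHH vs. mutated antigen


def log10_kd_to_molar(log10_kd: float) -> float:
    """Convert a log10-space KD back to molar units."""
    return 10.0 ** log10_kd


def delta_observed(wt: float, mut: float) -> float:
    """Change in observed affinity: log10 fold change upon mutation.

    Sign convention (assumption): mutant minus wild type, so a positive
    value means a larger KD, i.e. weaker observed binding.
    """
    return mut - wt


def relative_affinity_change(m: EpitopeMutation) -> float:
    """Change in relative affinity: primary-VHH change minus control-VHH change.

    The control VHH does not contact the mutated positions, so its change
    is read here as a proxy for the protein-quality component.
    """
    primary = delta_observed(m.primary_wt, m.primary_mut)
    control = delta_observed(m.control_wt, m.control_mut)
    return primary - control


# Invented example: the primary VHH loses 100-fold, the control VHH 10-fold.
example = EpitopeMutation(
    name="hypothetical_mutation",
    primary_wt=-9.0, primary_mut=-7.0,
    control_wt=-8.5, control_mut=-7.5,
)
print(f"wild-type KD (primary): {log10_kd_to_molar(example.primary_wt):.1e} M")
print("primary change:", delta_observed(example.primary_wt, example.primary_mut))
print("control change:", delta_observed(example.control_wt, example.control_mut))
print("relative change:", relative_affinity_change(example))
```

In this invented case, the 100-fold loss seen by the primary VHH splits into a 10-fold loss shared with the control VHH (attributed to protein quality) and a remaining 10-fold loss attributed to the interaction itself.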

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Bioinformatics: 1061 papers in training set; Top 3%; 10.2%
2. Protein Science: 221 papers in training set; Top 0.1%; 8.5%
3. mAbs: 28 papers in training set; Top 0.1%; 7.2%
4. Scientific Reports: 3102 papers in training set; Top 14%; 6.9%
5. Nature Communications: 4913 papers in training set; Top 28%; 6.4%
6. Communications Biology: 886 papers in training set; Top 2%; 3.6%
7. Computational and Structural Biotechnology Journal: 216 papers in training set; Top 2%; 3.6%
8. Antibody Therapeutics: 16 papers in training set; Top 0.1%; 3.6%
50% of probability mass above
9. Viruses: 318 papers in training set; Top 2%; 3.1%
10. eLife: 5422 papers in training set; Top 32%; 2.6%
11. Journal of Molecular Biology: 217 papers in training set; Top 0.9%; 2.6%
12. Nucleic Acids Research: 1128 papers in training set; Top 9%; 1.9%
13. International Journal of Molecular Sciences: 453 papers in training set; Top 6%; 1.9%
14. Frontiers in Immunology: 586 papers in training set; Top 4%; 1.8%
15. Briefings in Bioinformatics: 326 papers in training set; Top 4%; 1.8%
16. PLOS ONE: 4510 papers in training set; Top 53%; 1.7%
17. Biomolecules: 95 papers in training set; Top 0.5%; 1.7%
18. FEBS Open Bio: 29 papers in training set; Top 0.2%; 1.5%
19. PLOS Computational Biology: 1633 papers in training set; Top 18%; 1.3%
20. ImmunoInformatics: 11 papers in training set; Top 0.1%; 1.3%
21. Structure: 175 papers in training set; Top 2%; 1.2%
22. BMC Bioinformatics: 383 papers in training set; Top 5%; 1.2%
23. Frontiers in Molecular Biosciences: 100 papers in training set; Top 3%; 1.2%
24. Journal of Structural Biology: 58 papers in training set; Top 1%; 0.9%
25. Nature Methods: 336 papers in training set; Top 5%; 0.9%
26. iScience: 1063 papers in training set; Top 26%; 0.9%
27. Frontiers in Bioinformatics: 45 papers in training set; Top 0.9%; 0.8%
28. FEBS Letters: 42 papers in training set; Top 0.3%; 0.8%
29. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 44%; 0.8%
30. Computers in Biology and Medicine: 120 papers in training set; Top 5%; 0.8%