
CrossAffinity: A Sequence-Based Protein-Protein Binding Affinity Prediction Tool Using Cross-Attention Mechanism

Guan, J. S.; Wang, Z.; Mu, Y.

2026-02-23 bioinformatics
10.64898/2026.02.22.707318 bioRxiv

Protein-protein binding affinity is important for understanding protein interactions within a protein complex and for identifying strong drug-peptide binders to a target protein. Many structure-based models have been built with reasonable performance. However, such models require the protein complex structure as input, which is usually unavailable because of high cost and experimental constraints. To address this issue, the sequence-based CrossAffinity model was constructed in this study. It separates the protein complex into two distinct parts and uses a cross-attention module to extract contextual information about the interacting protein components in order to predict protein-protein binding affinity. Although trained on an older dataset, CrossAffinity outperformed all structure-based and sequence-based models on the S34 test set, which contains chronologically newer protein complex structures and binding affinity values, showing generalisability to new data points. On the other test sets, namely S90, the S90 subset and S79*, CrossAffinity also outperformed all other sequence-based models while maintaining performance comparable to that of many recently published structure-based models. The acceptable performance and quick inference of CrossAffinity allow it to be deployed in situations that require binding affinity predictions for many protein complexes that lack structural information.
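To illustrate the general idea of cross-attention between the two parts of a complex, here is a minimal single-head sketch in plain Python. It is not the CrossAffinity architecture: the embedding sizes, the random projection matrices and all names (cross_attention, part_a, part_b, w_q, w_k, w_v) are assumptions made purely for illustration. Each residue embedding of one part forms a query, and the residues of the other part supply the keys and values.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query_side, context_side, w_q, w_k, w_v):
    """Scaled dot-product attention where queries come from one part
    and keys/values come from the other part."""
    q = matmul(query_side, w_q)
    k = matmul(context_side, w_k)
    v = matmul(context_side, w_v)
    scale = math.sqrt(len(q[0]))
    k_transposed = [list(col) for col in zip(*k)]
    scores = matmul(q, k_transposed)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights or residue embeddings."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_model, d_head = 16, 8                    # assumed sizes, not from the paper
part_a = random_matrix(5, d_model, rng)    # 5 residues in the first part
part_b = random_matrix(7, d_model, rng)    # 7 residues in the second part
w_q = random_matrix(d_model, d_head, rng)
w_k = random_matrix(d_model, d_head, rng)
w_v = random_matrix(d_model, d_head, rng)

# Each residue of part A attends over all residues of part B.
a_context, a_weights = cross_attention(part_a, part_b, w_q, w_k, w_v)
```

Swapping the first two arguments gives the opposite direction, with part B attending over part A. A real model would learn the projections and typically use several heads.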

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

Rank  Papers in training set  Percentile  Probability  Journal
1     326                     Top 0.1%    18.4%        Briefings in Bioinformatics
2     207                     Top 0.3%    18.4%        Journal of Chemical Information and Modeling
3     1061                    Top 4%      6.7%         Bioinformatics
4     25                      Top 0.1%    6.2%         Journal of Cheminformatics
5     216                     Top 0.9%    4.8%         Computational and Structural Biotechnology Journal
50% of probability mass above
6     383                     Top 2%      4.3%         BMC Bioinformatics
7     120                     Top 0.7%    3.9%         Computers in Biology and Medicine
8     82                      Top 0.2%    3.5%         Proteins: Structure, Function, and Bioinformatics
9     4510                    Top 44%     2.7%         PLOS ONE
10    217                     Top 0.9%    2.6%         Journal of Molecular Biology
11    1633                    Top 13%     2.3%         PLOS Computational Biology
12    3102                    Top 54%     1.9%         Scientific Reports
13    184                     Top 3%      1.7%         Bioinformatics Advances
14    23                      Top 0.2%    1.3%         Computational Biology and Chemistry
15    32                      Top 0.3%    1.2%         IEEE/ACM Transactions on Computational Biology and Bioinformatics
16    95                      Top 2%      0.9%         Biomolecules
17    886                     Top 19%     0.9%         Communications Biology
18    37                      Top 2%      0.9%         Molecules
19    453                     Top 14%     0.8%         International Journal of Molecular Sciences
20    39                      Top 1.0%    0.8%         Communications Chemistry
21    4913                    Top 61%     0.8%         Nature Communications
22    215                     Top 2%      0.7%         Journal of Proteome Research
23    90                      Top 4%      0.7%         ACS Omega
24    11                      Top 0.3%    0.7%         Artificial Intelligence in the Life Sciences
25    171                     Top 7%      0.7%         Genomics, Proteomics & Bioinformatics
26    45                      Top 1%      0.7%         Frontiers in Bioinformatics
27    1128                    Top 20%     0.6%         Nucleic Acids Research
28    65                      Top 4%      0.6%         International Journal of Biological Macromolecules
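
The 50% cutoff stated for the top 5 journals can be checked from the listed percentages. The short sketch below accumulates the probabilities in rank order and reports how many journals are needed to reach the threshold. Only the first eight listed values are included, and the function name journals_to_reach is an illustrative assumption.

```python
from itertools import accumulate

# Predicted probabilities (in percent) for ranks 1 to 8, copied from the listing.
top_probabilities = [18.4, 18.4, 6.7, 6.2, 4.8, 4.3, 3.9, 3.5]
THRESHOLD = 50.0


def journals_to_reach(probabilities, threshold):
    """Return (count, cumulative mass) for the first rank at which the
    running total meets or exceeds the threshold."""
    for count, running_total in enumerate(accumulate(probabilities), start=1):
        if running_total >= threshold:
            return count, running_total
    raise ValueError("threshold not reached by the listed probabilities")


journals_needed, mass_reached = journals_to_reach(top_probabilities, THRESHOLD)
print(journals_needed, round(mass_reached, 1))  # prints: 5 54.5
```

The first four journals sum to 49.7%, just under the threshold, so the fifth journal is the one that carries the cumulative mass past 50%.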