CrossAffinity: A Sequence-Based Protein-Protein Binding Affinity Prediction Tool Using Cross-Attention Mechanism
Guan, J. S.; Wang, Z.; Mu, Y.
Protein-protein binding affinity is important for understanding the interactions within a protein complex and for identifying drug peptides that bind strongly to a target protein. Many structure-based models have been built previously with reasonable performance. However, such models require the structure of the protein complex as input, which is usually unavailable because of high cost and experimental constraints. To address this issue, this study constructed CrossAffinity, a sequence-based model that separates the protein complex into two distinct parts and uses a cross-attention module to extract contextual information about the interacting protein components in order to predict protein-protein binding affinity. CrossAffinity outperformed all structure-based and sequence-based models on the S34 test set, which contains chronologically newer protein complex structures and binding affinity values, despite being trained on an older dataset, showing generalisability to new data points. On the other test sets, namely S90, the S90 subset and S79*, CrossAffinity also outperformed all other sequence-based models while maintaining performance comparable to that of many recently published structure-based models. Its acceptable performance and quick inference enable CrossAffinity to be deployed in situations that require predicting the binding affinity of many protein complexes lacking structural information.
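As a rough illustration of the cross-attention idea mentioned in the abstract, the sketch below lets each residue embedding of one part of a complex attend over the residue embeddings of the other part, and vice versa. It is a minimal single-head example in plain Python under stated assumptions, not the CrossAffinity implementation: the embedding size, the random weights, the mean pooling, and all names (`cross_attention`, `part_a`, `part_b`, `mean_pool`) are hypothetical, and the abstract does not specify the actual architecture.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over a single list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query_seq, context_seq, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention.

    Queries come from one sequence; keys and values come from the other,
    so each query position is re-expressed in terms of the other sequence.
    """
    q = matmul(query_seq, w_q)
    k = matmul(context_seq, w_k)
    v = matmul(context_seq, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def mean_pool(seq):
    """Average over sequence positions to get one fixed-length vector."""
    return [sum(col) / len(seq) for col in zip(*seq)]


rng = random.Random(0)
D_MODEL, D_HEAD = 16, 8  # hypothetical sizes, chosen only for illustration

# Placeholder per-residue embeddings for the two parts of a complex
# (5 and 7 residues); a real model would derive these from the sequences.
part_a = random_matrix(5, D_MODEL, rng)
part_b = random_matrix(7, D_MODEL, rng)

w_q = random_matrix(D_MODEL, D_HEAD, rng)
w_k = random_matrix(D_MODEL, D_HEAD, rng)
w_v = random_matrix(D_MODEL, D_HEAD, rng)

# Part A attends over part B, and part B attends over part A.
a_ctx, a_weights = cross_attention(part_a, part_b, w_q, w_k, w_v)
b_ctx, b_weights = cross_attention(part_b, part_a, w_q, w_k, w_v)

# One fixed-length vector per complex; a regression head (not shown)
# could map such a vector to a predicted affinity.
pooled = mean_pool(a_ctx) + mean_pool(b_ctx)
```

Because the attention weights are computed between the two parts rather than within one sequence, the pooled vector depends on both partners jointly, which is the property that makes cross-attention a natural fit for a two-part sequence input.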
Matching journals
The top 5 journals account for 50% of the predicted probability mass.