eSIG-Net: Accurate prediction of single-mutation induced perturbations on protein interactions using a language model
Pan, X.; Shrawat, A.; Raghavan, S.; Dong, C.; Yang, Y.; Li, Z.; Zheng, W. J.; Eckhardt, S. G.; Wu, E.; Fuxman Bass, J. I.; Jarosz, D. F.; Chen, S.; McGrail, D. J.; Sheynkman, G. M.; Huang, J. H.; Sahni, N.; Yi, S. S.
Most proteins exert their functions in complexes with other interacting partners. A single mutation can profoundly perturb protein interactions and thereby lead to human disease. However, predicting the effect of single mutations on protein interactions remains a major computational challenge. Deep learning, particularly protein language models and transformers, has become an effective tool in bioinformatics for protein structure prediction. Yet the functional divergence of mutations makes it difficult to predict their interaction perturbation profiles. To address this challenge, we present eSIG-Net (edgetic mutation Sequence-based Interaction Grammar Network), a novel sequence-based "Interaction Language Model" for predicting protein interaction alterations caused by single mutations. eSIG-Net combines multiple protein sequence embeddings, introduces a mutation-encoding module that incorporates syntax and evolutionary insights, and employs contrastive learning to evaluate mutation-induced interaction changes. eSIG-Net significantly outperforms current state-of-the-art sequence-based and structure-based methods at predicting mutational impact on protein interactions. We highlight examples where eSIG-Net nominates causal variants with high confidence and elucidates their functional roles in relevant biological contexts. Taken together, eSIG-Net is a first-of-its-kind "interaction language model" that accurately predicts interaction-specific rewiring by single mutations from sequence information alone and generalizes across biological contexts.
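To make the contrastive-learning idea in the abstract concrete, here is a minimal sketch. It is not the eSIG-Net implementation: the abstract does not specify the loss, so the classic pairwise margin contrastive loss is used as a stand-in. The function names, the `preserved` label, and the `margin` parameter are all hypothetical, and the embeddings are plain Python lists standing in for learned representations of a wild-type and a mutant interaction pair.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(wt_emb, mut_emb, preserved, margin=1.0):
    """Pairwise margin contrastive loss (illustrative stand-in only).

    wt_emb, mut_emb: hypothetical embeddings of the wild-type and mutant
        protein paired with the same interaction partner.
    preserved: True if the mutation leaves the interaction intact,
        False if it perturbs the interaction.
    margin: assumed minimum distance for perturbed pairs.
    """
    d = euclidean_distance(wt_emb, mut_emb)
    if preserved:
        # Preserved interactions are pulled together in embedding space.
        return d ** 2
    # Perturbed interactions are pushed apart until they clear the margin.
    return max(0.0, margin - d) ** 2
```

Under this assumed loss, a mutation that preserves an interaction is penalized when its embedding drifts away from the wild type, while a perturbing mutation is penalized only while its embedding stays within the margin of the wild type.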