
eSIG-Net: Accurate prediction of single-mutation induced perturbations on protein interactions using a language model

Pan, X.; Shrawat, A.; Raghavan, S.; Dong, C.; Yang, Y.; Li, Z.; Zheng, W. J.; Eckhardt, S. G.; Wu, E.; Fuxman Bass, J. I.; Jarosz, D. F.; Chen, S.; McGrail, D. J.; Sheynkman, G. M.; Huang, J. H.; Sahni, N.; Yi, S. S.

Posted 2026-03-31 on bioRxiv (bioinformatics). DOI: 10.64898/2026.03.27.714913
Abstract

Most proteins exert their functions in complexes with other interactors. Single mutations can profoundly perturb protein interactions, leading to human disease, yet predicting the effect of single mutations on protein interactions remains a major computational challenge. Deep learning, particularly transformer-based protein language models, has become an effective tool in bioinformatics for protein structure prediction; however, the functional divergence of mutations makes it difficult to predict their interaction-perturbation profiles. To address this fundamental challenge, we present eSIG-Net (edgetic mutation Sequence-based Interaction Grammar Network), a novel sequence-based "interaction language model" for predicting protein interaction alterations caused by single mutations. eSIG-Net combines multiple protein sequence embeddings, introduces a mutation-encoding module informed by syntactic and evolutionary insights, and employs contrastive learning to evaluate mutation-induced interaction changes. eSIG-Net significantly outperforms current state-of-the-art sequence-based and structure-based methods at predicting the mutational impact on protein interactions. We highlight examples in which eSIG-Net nominates causal variants with high confidence and elucidates their functional roles in relevant biological contexts. Together, eSIG-Net is a first-of-its-kind interaction language model that accurately predicts interaction-specific rewiring by single mutations from sequence information alone, and it generalizes across biological contexts.
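The abstract describes scoring mutation-induced interaction changes by contrasting wild-type and mutant sequence embeddings. The sketch below illustrates that general idea only; it is not eSIG-Net's actual model. The embeddings, the cosine-similarity score, and the `perturbation_score` function are all hypothetical stand-ins assumed for illustration.

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def perturbation_score(wt_a: np.ndarray, wt_b: np.ndarray,
                       mut_a: np.ndarray) -> float:
    """Contrastive-style score: the drop in pairwise similarity when the
    wild-type embedding of protein A is replaced by its mutant embedding.
    A large positive value suggests an edgetic (interaction-disrupting)
    mutation; near zero suggests the interaction is preserved."""
    return cosine(wt_a, wt_b) - cosine(mut_a, wt_b)

# Toy embeddings standing in for language-model outputs (hypothetical).
rng = np.random.default_rng(0)
wt_a = rng.normal(size=16)
wt_b = wt_a + 0.1 * rng.normal(size=16)  # closely aligned interaction partner
mut_a = -wt_a                            # an embedding flipped by a mutation

score = perturbation_score(wt_a, wt_b, mut_a)  # large: interaction disrupted
```

In a trained model, a contrastive objective would push scores for known edgetic mutations above those for interaction-preserving mutations, rather than relying on raw cosine similarity as this toy does.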

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

#   Journal                                                         Training papers  Top percentile  Probability
1   Bioinformatics                                                  1061             Top 0.7%        32.7%
2   Bioinformatics Advances                                         184              Top 0.1%        10.0%
3   Briefings in Bioinformatics                                     326              Top 1%          4.8%
4   Nature Communications                                           4913             Top 33%         4.8%
--- 50% of probability mass above ---
5   Cell Systems                                                    167              Top 3%          4.1%
6   PLOS Computational Biology                                      1633             Top 10%         3.6%
7   Nature Machine Intelligence                                     61               Top 1%          2.7%
8   Nucleic Acids Research                                          1128             Top 7%          2.7%
9   Genome Research                                                 409              Top 1%          2.7%
10  BMC Bioinformatics                                              383              Top 4%          2.1%
11  NAR Genomics and Bioinformatics                                 214              Top 2%          1.9%
12  Journal of Molecular Biology                                    217              Top 1%          1.8%
13  Nature Methods                                                  336              Top 4%          1.7%
14  eLife                                                           5422             Top 42%         1.7%
15  Computational and Structural Biotechnology Journal              216              Top 5%          1.7%
16  Scientific Reports                                              3102             Top 62%         1.5%
17  Proceedings of the National Academy of Sciences                 2130             Top 38%         1.2%
18  Patterns                                                        70               Top 1%          1.2%
19  Advanced Science                                                249              Top 14%         1.2%
20  IEEE Transactions on Computational Biology and Bioinformatics   17               Top 0.4%        0.9%
21  Genome Biology                                                  555              Top 6%          0.9%
22  Genome Medicine                                                 154              Top 8%          0.7%
23  Frontiers in Genetics                                           197              Top 10%         0.7%
24  Genomics, Proteomics & Bioinformatics                           171              Top 7%          0.7%
25  Journal of Chemical Information and Modeling                    207              Top 3%          0.6%
26  Nature Computational Science                                    50               Top 2%          0.6%