
Improved prediction of virus-human protein-protein interactions by incorporating network topology and viral molecular mimicry

Zhang, Z.; Feng, Y.; Meng, X.; Peng, Y.

2026-03-03 bioinformatics
10.64898/2026.02.28.708776 bioRxiv

Protein-protein interactions (PPIs) between viruses and humans play crucial roles in viral infections. Although numerous computational approaches have been proposed for predicting virus-human PPIs, their performance remains suboptimal and may be overestimated because no benchmark dataset is available. To address these limitations, we first constructed a carefully curated benchmark dataset, ensuring that the training and test sets share no PPIs and that both human and viral proteins have minimal sequence similarity between the two sets. Based on this dataset, we developed vhPPIpred, a machine learning-based prediction method that incorporates sequence embeddings and evolutionary information and also leverages network topology and viral molecular mimicry of human PPIs. Comparative experiments demonstrated that vhPPIpred outperformed five state-of-the-art methods on both our benchmark dataset and three independent datasets. vhPPIpred also achieved high computational efficiency, requiring relatively low runtime and memory. Finally, vhPPIpred showed great potential for identifying human virus receptors and for inferring virus phenotypes, as the virus-human PPIs it predicts can be used to effectively infer virus virulence. In summary, this study provides a valuable benchmark dataset and an effective tool for virus-human PPI prediction, with potential applications in antiviral drug discovery, host-pathogen interaction research and early warning of emerging viruses.
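The non-overlap requirement described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `split_leakage` and the protein identifiers are hypothetical, and exact identifier matching is used only as a stand-in for the sequence-similarity filtering the abstract mentions, which would need an alignment or clustering tool.

```python
def split_leakage(train_pairs, test_pairs):
    """Return (shared PPIs, shared proteins) between two sets of (virus, human) pairs.

    Hypothetical helper: it only detects exact identifier matches and does not
    measure sequence similarity.
    """
    train = {tuple(pair) for pair in train_pairs}
    test = {tuple(pair) for pair in test_pairs}
    shared_pairs = train & test
    # Collect every protein identifier that appears on either side of a pair.
    train_proteins = {protein for pair in train for protein in pair}
    test_proteins = {protein for pair in test for protein in pair}
    return shared_pairs, train_proteins & test_proteins


# Made-up identifiers, for illustration only.
train_set = [("vP1", "hA"), ("vP2", "hB")]
test_set = [("vP3", "hC"), ("vP1", "hA")]
shared_pairs, shared_proteins = split_leakage(train_set, test_set)
```

Both returned sets being empty is a necessary condition for a leakage-free split under these assumptions, but not a sufficient one, since near-identical sequences with different identifiers would go undetected.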

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Briefings in Bioinformatics | 326 | Top 0.1% | 22.8% |
| 2 | PLOS Computational Biology | 1633 | Top 2% | 12.6% |
| 3 | Journal of Chemical Information and Modeling | 207 | Top 0.5% | 10.2% |
| 4 | Computers in Biology and Medicine | 120 | Top 0.3% | 6.4% |
| 5 | Computational and Structural Biotechnology Journal | 216 | Top 0.5% | 6.4% |
| 6 | Genomics, Proteomics & Bioinformatics | 171 | Top 2% | 3.6% |
| 7 | Bioinformatics | 1061 | Top 6% | 2.8% |
| 8 | BMC Bioinformatics | 383 | Top 4% | 2.1% |
| 9 | Scientific Reports | 3102 | Top 50% | 2.1% |
| 10 | Advanced Science | 249 | Top 11% | 1.7% |
| 11 | Frontiers in Immunology | 586 | Top 4% | 1.7% |
| 12 | PLOS ONE | 4510 | Top 60% | 1.2% |
| 13 | Frontiers in Genetics | 197 | Top 8% | 0.9% |
| 14 | Communications Biology | 886 | Top 18% | 0.9% |
| 15 | International Journal of Molecular Sciences | 453 | Top 15% | 0.8% |
| 16 | Frontiers in Molecular Biosciences | 100 | Top 5% | 0.8% |
| 17 | Virologica Sinica | 10 | Top 0.4% | 0.8% |
| 18 | Science Bulletin | 22 | Top 0.8% | 0.8% |
| 19 | Frontiers in Bioinformatics | 45 | Top 0.9% | 0.8% |
| 20 | Quantitative Biology | 11 | Top 0.8% | 0.7% |
| 21 | Frontiers in Microbiology | 375 | Top 9% | 0.7% |
| 22 | IEEE Journal of Biomedical and Health Informatics | 34 | Top 2% | 0.7% |
| 23 | IEEE/ACM Transactions on Computational Biology and Bioinformatics | 32 | Top 0.7% | 0.7% |
| 24 | Patterns | 70 | Top 3% | 0.7% |
| 25 | Journal of Genetics and Genomics | 36 | Top 3% | 0.5% |
| 26 | Viruses | 318 | Top 7% | 0.5% |
| 27 | iScience | 1063 | Top 40% | 0.5% |
| 28 | Bioinformatics Advances | 184 | Top 6% | 0.5% |
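
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked with a running sum over the ranked probabilities listed above. The snippet below is a minimal sketch: the function name `journals_to_reach` is hypothetical, and only the first eight probabilities are copied in, since the threshold is crossed well before the end of the list.

```python
from itertools import accumulate

# Predicted probabilities (%) for ranks 1-8, copied from the ranked list above.
probabilities = [22.8, 12.6, 10.2, 6.4, 6.4, 3.6, 2.8, 2.1]


def journals_to_reach(probs, threshold=50.0):
    """Return (count, cumulative sum) for the shortest prefix reaching the threshold."""
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count, total
    return None  # threshold never reached with the supplied values


count, total = journals_to_reach(probabilities)
print(count, round(total, 1))
```

With these values the first three journals sum to 45.6% and the fourth brings the total to 52.0%, which is why the cutoff falls after rank 4.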