GraphSite: ligand-binding site classification using deep graph neural network

Shi, W.; Singha, M.; Pu, L.; Ramanujam, J. R.; Brylinski, M.

2021-12-07 bioinformatics
10.1101/2021.12.06.471420 bioRxiv
Binding sites are concave surfaces on proteins that bind small molecules called ligands. The types of molecules that bind to a protein determine its biological function, and the binding process itself is crucial to many biological functions. Identifying and classifying such binding sites would therefore contribute substantially to biomedical applications such as drug repurposing. Deep learning uses deep neural networks to handle complex tasks such as image classification and language translation. Previous work has demonstrated that deep learning models can handle binding sites represented as pixels or voxels. Graph neural networks (GNNs) are deep learning models that operate on graphs. GNNs are promising for binding-site-related tasks, provided there is an adequate graph representation to model the binding sites. In this communication, we describe a GNN-based computational method, GraphSite, that utilizes a novel graph representation of ligand-binding sites. A state-of-the-art GNN model is trained to capture the intrinsic characteristics of these binding sites and classify them. Our model generalizes well to unseen data and achieves a test accuracy of 81.28% on classifying 14 binding site classes.

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Bioinformatics | 1061 | Top 2% | 12.5% |
| 2 | IEEE/ACM Transactions on Computational Biology and Bioinformatics | 32 | Top 0.1% | 10.2% |
| 3 | PLOS ONE | 4510 | Top 22% | 8.5% |
| 4 | Scientific Reports | 3102 | Top 14% | 6.9% |
| 5 | IEEE Journal of Biomedical and Health Informatics | 34 | Top 0.3% | 4.3% |
| 6 | BMC Bioinformatics | 383 | Top 2% | 4.0% |
| 7 | Briefings in Bioinformatics | 326 | Top 2% | 3.1% |
| 8 | PLOS Computational Biology | 1633 | Top 11% | 3.1% |
50% of probability mass above
| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 9 | Neurocomputing | 13 | Top 0.1% | 2.8% |
| 10 | Computational and Structural Biotechnology Journal | 216 | Top 2% | 2.6% |
| 11 | IEEE Transactions on Computational Biology and Bioinformatics | 17 | Top 0.1% | 2.6% |
| 12 | Journal of Chemical Information and Modeling | 207 | Top 2% | 2.1% |
| 13 | Frontiers in Genetics | 197 | Top 4% | 1.9% |
| 14 | Frontiers in Computational Neuroscience | 53 | Top 1% | 1.9% |
| 15 | Journal of Computational Biology | 37 | Top 0.1% | 1.8% |
| 16 | Bioengineering | 24 | Top 0.4% | 1.7% |
| 17 | iScience | 1063 | Top 24% | 1.0% |
| 18 | IEEE Access | 31 | Top 0.9% | 0.8% |
| 19 | Computers in Biology and Medicine | 120 | Top 4% | 0.8% |
| 20 | Neuroinformatics | 40 | Top 1.0% | 0.8% |
| 21 | Frontiers in Bioinformatics | 45 | Top 0.9% | 0.8% |
| 22 | Biomolecules | 95 | Top 2% | 0.7% |
| 23 | Communications Biology | 886 | Top 26% | 0.7% |
| 24 | Molecules | 37 | Top 2% | 0.7% |
| 25 | Frontiers in Molecular Biosciences | 100 | Top 6% | 0.7% |
| 26 | Computational Biology and Chemistry | 23 | Top 0.7% | 0.7% |
| 27 | Journal of Pathology Informatics | 13 | Top 0.4% | 0.7% |
| 28 | Bioinformatics Advances | 184 | Top 5% | 0.7% |
| 29 | Expert Systems with Applications | 11 | Top 0.6% | 0.7% |
| 30 | Quantitative Biology | 11 | Top 0.9% | 0.7% |
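
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed probabilities. The following is a minimal sketch, not the site's own computation: the function name `ranks_to_reach` is hypothetical, and the values are the rounded percentages shown for ranks 1 to 10, so the totals are approximate.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-10, copied from the list above.
# These are the rounded values displayed on the page, so sums are approximate.
probs = [12.5, 10.2, 8.5, 6.9, 4.3, 4.0, 3.1, 3.1, 2.8, 2.6]


def ranks_to_reach(probabilities, threshold=50.0):
    """Return (rank, cumulative total) for the first rank whose running
    sum reaches the threshold, or None if it is never reached."""
    for rank, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return rank, total
    return None


rank, total = ranks_to_reach(probs)
print(rank, round(total, 1))
```

With these values the running sum after rank 7 is about 49.5% and after rank 8 about 52.6%, which is consistent with the cutoff being placed after the eighth journal.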