Pharmacological proximities in the GPCR family discovered using contact-informed amino-acid and binding pocket similarities

So, S. S.; Ngo, T.; Ilatovskiy, A. V.; Finch, A. M.; Riek, R. P.; Abagyan, R.; Smith, N. J.; Kufareva, I.

2026-05-06 bioinformatics
10.64898/2026.05.02.720972 bioRxiv

Understanding protein proximities in the theoretical ligand space is essential for developing therapeutics with desirable polypharmacology, predicting off-targets, and discovering surrogate ligands for poorly characterized proteins. This is especially important for G protein-coupled receptors (GPCRs), a major class of drug targets, many of which still lack known ligands. Circumventing this limitation, we present GPCR-CoINPocket v2, a contact-informed metric for detecting GPCR pharmacological similarities from amino-acid sequences alone. We first establish a "gold standard" of pharmacological relatedness using ChEMBL-derived ligand sets. We then replace traditional evolutionary amino acid similarity matrices with a chemically-informed matrix derived from protein:ligand interaction patterns across 3,306 structures, significantly improving early detection of shared pharmacology between distantly homologous receptors. An additional unconstrained, contact-informed matrix further enhances predictive performance. Pilot application of the method revealed previously unrecognized similarities between the β2 adrenoceptor and three Class A peptide GPCRs, which we confirmed experimentally by demonstrating the binding of select ligands of these receptors to the β2 adrenoceptor. Dimensionality reduction of similarity scores recapitulates known receptor relationships and predicts neighbors of orphan GPCRs later confirmed experimentally. Overall, GPCR-CoINPocket v2 provides a powerful sequence-based framework to prioritize ligand space, predict polypharmacology, and accelerate GPCR drug discovery and deorphanization.
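To make the idea of a contact-informed pocket similarity concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that two receptors' binding pockets are already aligned position by position, that each position carries a ligand-contact weight, and that residue pairs are scored with a similarity matrix. The names `TOY_MATRIX`, `residue_similarity`, and `pocket_similarity`, along with all numeric values, are hypothetical and chosen only for illustration; the abstract does not specify the scoring formula.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical toy scores for a few residue pairs.
# These are illustrative values, NOT the matrix described in the preprint.
TOY_MATRIX: Dict[Tuple[str, str], float] = {
    ("D", "D"): 1.0,
    ("E", "E"): 1.0,
    ("F", "F"): 1.0,
    ("Y", "Y"): 1.0,
    ("D", "E"): 0.7,
    ("F", "Y"): 0.6,
}


def residue_similarity(a: str, b: str, matrix: Dict[Tuple[str, str], float]) -> float:
    """Look up a pair score symmetrically; pairs not in the matrix score 0.0."""
    return matrix.get((a, b), matrix.get((b, a), 0.0))


def pocket_similarity(
    pocket_a: Sequence[str],
    pocket_b: Sequence[str],
    contact_weights: Sequence[float],
    matrix: Dict[Tuple[str, str], float] = TOY_MATRIX,
) -> float:
    """Contact-weighted mean of residue similarities over aligned pocket positions.

    Assumption: positions that contact ligands more strongly get larger weights,
    so they contribute more to the final score.
    """
    if not (len(pocket_a) == len(pocket_b) == len(contact_weights)):
        raise ValueError("pockets and weights must have the same length")
    total_weight = sum(contact_weights)
    if total_weight <= 0:
        raise ValueError("contact weights must sum to a positive value")
    weighted = sum(
        w * residue_similarity(a, b, matrix)
        for a, b, w in zip(pocket_a, pocket_b, contact_weights)
    )
    return weighted / total_weight


# Usage: compare two hypothetical three-residue pockets, with the first
# position weighted twice as heavily as the others.
print(pocket_similarity("DFY", "EFF", [2.0, 1.0, 1.0]))
```

In this sketch, swapping `TOY_MATRIX` for a different matrix changes the score without touching the weighting logic, which mirrors the abstract's point that the choice of amino-acid similarity matrix is a replaceable component of the metric.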

Matching journals

The top 6 journals together account for just over 50% (52.2%) of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Probability
1 | Journal of Chemical Information and Modeling | 207 | Top 0.2% | 22.0%
2 | Bioinformatics | 1061 | Top 2% | 12.4%
3 | Cell Systems | 167 | Top 2% | 6.2%
4 | Proceedings of the National Academy of Sciences | 2130 | Top 18% | 3.9%
5 | Nature Communications | 4913 | Top 38% | 3.9%
6 | Journal of Cheminformatics | 25 | Top 0.1% | 3.8%
(50% of probability mass above)
7 | Nature Methods | 336 | Top 3% | 3.5%
8 | Bioinformatics Advances | 184 | Top 2% | 2.7%
9 | PLOS Computational Biology | 1633 | Top 13% | 2.4%
10 | Briefings in Bioinformatics | 326 | Top 3% | 2.3%
11 | Advanced Science | 249 | Top 9% | 2.0%
12 | Computational and Structural Biotechnology Journal | 216 | Top 3% | 2.0%
13 | Communications Biology | 886 | Top 6% | 2.0%
14 | Scientific Reports | 3102 | Top 54% | 1.8%
15 | Nucleic Acids Research | 1128 | Top 11% | 1.7%
16 | Nature Machine Intelligence | 61 | Top 2% | 1.7%
17 | Nature Biotechnology | 147 | Top 5% | 1.7%
18 | PLOS ONE | 4510 | Top 57% | 1.5%
19 | eLife | 5422 | Top 46% | 1.5%
20 | Chemical Science | 71 | Top 1% | 1.3%
21 | Patterns | 70 | Top 1% | 1.3%
22 | NAR Genomics and Bioinformatics | 214 | Top 3% | 1.2%
23 | Cell Reports Methods | 141 | Top 3% | 1.2%
24 | Genome Medicine | 154 | Top 6% | 1.1%
25 | BMC Bioinformatics | 383 | Top 6% | 0.9%
26 | Communications Chemistry | 39 | Top 1% | 0.7%
27 | Journal of Molecular Biology | 217 | Top 4% | 0.7%
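
The 50% cutoff can be checked directly from the probabilities listed above. The short Python sketch below accumulates the per-journal probabilities in rank order and reports the first rank at which the running total reaches a threshold. The function name `journals_needed` is hypothetical; the values are copied from the list, and how the site itself computes the cutoff is an assumption.

```python
from itertools import accumulate
from typing import Optional, Sequence

# Predicted probabilities (in percent) for ranks 1 to 27, copied from the list above.
PROBABILITIES = [
    22.0, 12.4, 6.2, 3.9, 3.9, 3.8, 3.5, 2.7, 2.4,
    2.3, 2.0, 2.0, 2.0, 1.8, 1.7, 1.7, 1.7, 1.5,
    1.5, 1.3, 1.3, 1.2, 1.2, 1.1, 0.9, 0.7, 0.7,
]


def journals_needed(probabilities: Sequence[float], threshold: float) -> Optional[int]:
    """Return the smallest rank whose cumulative probability reaches the threshold.

    Returns None if the listed probabilities never reach the threshold.
    """
    for rank, cumulative in enumerate(accumulate(probabilities), start=1):
        if cumulative >= threshold:
            return rank
    return None


# Usage: how many top-ranked journals are needed to cover half the probability mass?
print(journals_needed(PROBABILITIES, 50.0))
```

The first five journals sum to about 48.4%, so the sixth is the one that pushes the running total past 50%.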