Sequence-Driven Drug-Target Affinity Prediction Via Graph Attention Networks and Bidirectional Cross-Attention Fusion

Kudari, Z.; Kaira, V. S.; P, S. S.; Bhat, R.; Gnana Sekaran, J.

2026-04-06 bioinformatics
10.64898/2026.04.03.716294 bioRxiv
Accurate prediction of drug-target affinity (DTA) is a core challenge in computational drug discovery. Structure-based methods depend on experimentally determined protein coordinates, which are unavailable for most drug-relevant targets. Sequence-only approaches, in turn, operate on linear residue representations and lack an explicit mechanism to encode the spatial proximity relationships that govern protein-ligand interactions. We present XAttn-DTA, a sequence-driven framework that addresses both limitations without requiring experimental structural data. Drug molecules are encoded as 2D molecular graphs via multilayer Graph Attention Networks (GATs), capturing atomic topology and bond-level chemistry. Proteins are represented as residue-level graphs constructed from ESM2-predicted contact maps, which capture inter-residue coevolutionary and structural signals embedded within the sequence. The bidirectional cross-attention fusion module projects both embeddings into a shared latent space and applies dual multi-head cross-attention, enabling ligand and protein residue environments to inform one another. On the Davis benchmark, XAttn-DTA achieves a concordance index (CI) of 0.907 and an MSE of 0.175, improving CI by 1.8% and reducing MSE by 9.3% over the strongest baseline. On KIBA, it achieves an MSE of 0.121, a 13.6% reduction. Under three strict cold-start settings across Davis, KIBA, and BindingDB, the model yields MSE reductions of up to 79.0% and CI improvements of up to 31.5% over the strongest baseline, demonstrating strong generalization to unseen scaffolds and novel protein families.
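
The abstract reports results as a concordance index (CI) and a mean squared error (MSE). As a rough illustration of what these two metrics measure, here is a minimal pure-Python sketch. It assumes the common pairwise definition of CI, in which pairs with equal true affinity are skipped and tied predictions earn half credit; the abstract does not say which variant the authors use. The function names and the toy affinity values are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Assumed convention (not confirmed by the source): pairs with equal true
    affinity are skipped, and tied predictions count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue  # equal true affinities give no ordering to check
        comparable += 1
        if p_i == p_j:
            concordant += 0.5  # tied prediction: half credit
        elif (t_i < t_j) == (p_i < p_j):
            concordant += 1.0  # predicted order agrees with true order
    if comparable == 0:
        raise ValueError("no comparable pairs: all true values are equal")
    return concordant / comparable


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted affinities."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy affinity values, invented purely for illustration.
true_affinity = [5.0, 6.2, 7.1, 8.4]
predicted_affinity = [5.3, 6.0, 7.5, 7.5]

ci = concordance_index(true_affinity, predicted_affinity)
mse = mean_squared_error(true_affinity, predicted_affinity)
print(f"CI = {ci:.3f}, MSE = {mse:.3f}")
```

A higher CI means better ranking of drug-target pairs (1.0 is a perfect ordering, 0.5 is chance level), while a lower MSE means predicted affinities are numerically closer to the measured ones, which is why the abstract reports CI improvements alongside MSE reductions.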

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. Bioinformatics | 1061 papers in training set | Top 2% | 18.2%
2. Nature Communications | 4913 papers in training set | Top 19% | 9.9%
3. Cell Systems | 167 papers in training set | Top 2% | 6.2%
4. Nature Methods | 336 papers in training set | Top 2% | 6.2%
5. Nature Biotechnology | 147 papers in training set | Top 2% | 4.7%
6. Advanced Science | 249 papers in training set | Top 5% | 3.9%
7. Journal of Chemical Information and Modeling | 207 papers in training set | Top 1% | 3.5%

50% of probability mass above

8. Nature Machine Intelligence | 61 papers in training set | Top 1% | 3.5%
9. Nucleic Acids Research | 1128 papers in training set | Top 7% | 3.2%
10. Bioinformatics Advances | 184 papers in training set | Top 2% | 3.0%
11. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 24% | 2.8%
12. Briefings in Bioinformatics | 326 papers in training set | Top 2% | 2.8%
13. Communications Biology | 886 papers in training set | Top 7% | 1.8%
14. PLOS Computational Biology | 1633 papers in training set | Top 15% | 1.7%
15. Scientific Reports | 3102 papers in training set | Top 56% | 1.7%
16. Nature Computational Science | 50 papers in training set | Top 0.8% | 1.5%
17. Genome Medicine | 154 papers in training set | Top 6% | 1.3%
18. PLOS ONE | 4510 papers in training set | Top 59% | 1.3%
19. Science | 429 papers in training set | Top 16% | 1.3%
20. Journal of Cheminformatics | 25 papers in training set | Top 0.4% | 1.2%
21. Patterns | 70 papers in training set | Top 2% | 1.2%
22. Genome Biology | 555 papers in training set | Top 6% | 1.2%
23. eLife | 5422 papers in training set | Top 51% | 1.1%
24. iScience | 1063 papers in training set | Top 27% | 0.9%
25. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 9% | 0.8%
26. Nature Genetics | 240 papers in training set | Top 8% | 0.7%
27. Nature Biomedical Engineering | 42 papers in training set | Top 2% | 0.7%
28. Nature | 575 papers in training set | Top 16% | 0.7%
29. BMC Bioinformatics | 383 papers in training set | Top 7% | 0.7%
30. Genome Research | 409 papers in training set | Top 5% | 0.6%
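
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed percentages. The sketch below is only an arithmetic check on the first ten values shown in the ranking; the function name is hypothetical, and nothing here reflects how the ranking itself was produced.

```python
# Predicted probabilities (in percent) for the first ten journals, in rank order,
# copied from the listing above.
top_probabilities = [18.2, 9.9, 6.2, 6.2, 4.7, 3.9, 3.5, 3.5, 3.2, 3.0]


def journals_to_reach(probabilities, threshold):
    """Return (count, cumulative sum) once the running total reaches the threshold."""
    total = 0.0
    for count, probability in enumerate(probabilities, start=1):
        total += probability
        if total >= threshold:
            return count, total
    raise ValueError("threshold not reached by the listed probabilities")


count, cumulative = journals_to_reach(top_probabilities, 50.0)
print(f"{count} journals reach {cumulative:.1f}% of the probability mass")
```

With these values the first six journals sum to 49.1%, so the seventh is the one that carries the running total past 50%.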