Drug-Target Interaction Prediction with PIGLET

Carpenter, K. A.; Altman, R. B.

2026-02-18 bioinformatics

10.64898/2026.02.18.706530 bioRxiv

Drug-target interaction (DTI) prediction is a key task in computer-aided drug development that has been widely approached with deep learning models. Despite extremely high reported performance, these models have yet to find widespread success in accelerating real-world drug discovery. In contrast with the most common approach of creating embeddings from one-dimensional or three-dimensional representations of the input drug and input target, we create a novel graph transformer method for DTI prediction that operates on a proteome-wide knowledge graph of binding pocket similarity, protein-protein interactions, drug similarity, and known binding relationships. We benchmark our method, named PIGLET, against existing DTI prediction models on the Human dataset. We assess performance with two different splitting strategies: the frequently reported random split, and a novel, more rigorous drug-based split. All models perform similarly well on the random split, and PIGLET outperforms all models on the drug-based split. We highlight the utility of PIGLET through a real-world drug discovery case study.
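
The difference between the two splitting strategies can be illustrated with a minimal sketch. The abstract does not define the drug-based split precisely, so the code below assumes it means that no drug appears in both the training and test sets. The function names, the `(drug, target, label)` row layout, and the toy data are all hypothetical and are not taken from PIGLET.

```python
import random
from collections import defaultdict


def random_split(pairs, test_frac=0.2, seed=0):
    """Shuffle individual (drug, target, label) rows and cut.

    The same drug can land in both train and test.
    """
    rng = random.Random(seed)
    rows = list(pairs)
    rng.shuffle(rows)
    n_test = int(round(len(rows) * test_frac))
    return rows[n_test:], rows[:n_test]


def drug_based_split(pairs, test_frac=0.2, seed=0):
    """Assign whole drugs to train or test (assumed definition).

    Every row for a given drug goes to the same side, so test drugs
    are never seen during training.
    """
    by_drug = defaultdict(list)
    for row in pairs:
        by_drug[row[0]].append(row)
    drugs = sorted(by_drug)
    rng = random.Random(seed)
    rng.shuffle(drugs)
    n_test = max(1, int(round(len(drugs) * test_frac)))
    test = [r for d in drugs[:n_test] for r in by_drug[d]]
    train = [r for d in drugs[n_test:] for r in by_drug[d]]
    return train, test


# Hypothetical toy data: 5 drugs x 7 targets, 35 unique pairs.
pairs = [(f"drug{i % 5}", f"prot{i % 7}", i % 2) for i in range(35)]

train, test = drug_based_split(pairs)
shared = {r[0] for r in train} & {r[0] for r in test}
rand_train, rand_test = random_split(pairs)
```

Under this assumption, `shared` is empty for the drug-based split, whereas a random split over the same rows generally leaves most drugs on both sides, which is one plausible reason random-split scores can look optimistic.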

Matching journals

●Non-profit ◐University press ○Commercial

The top 7 journals account for 50% of the predicted probability mass.

◐ 1061 papers in training set

○ 167 papers in training set

○ 336 papers in training set

Journal of Chemical Information and Modeling

● 207 papers in training set

Bioinformatics Advances

◐ 184 papers in training set

Nature Communications

○ 4913 papers in training set

Journal of Cheminformatics

○ 25 papers in training set

50% of probability mass above

Proceedings of the National Academy of Sciences

● 2130 papers in training set

Briefings in Bioinformatics

◐ 326 papers in training set

Nature Machine Intelligence

○ 61 papers in training set

Scientific Reports

○ 3102 papers in training set

Nucleic Acids Research

◐ 1128 papers in training set

Advanced Science

○ 249 papers in training set

PLOS Computational Biology

● 1633 papers in training set

Nature Biotechnology

○ 147 papers in training set

Communications Biology

○ 886 papers in training set

IEEE Transactions on Computational Biology and Bioinformatics

● 17 papers in training set

● 5422 papers in training set

○ 70 papers in training set

● 4510 papers in training set

BMC Bioinformatics

○ 383 papers in training set

Genome Medicine

○ 154 papers in training set

Chemical Science

● 71 papers in training set

NAR Genomics and Bioinformatics

◐ 214 papers in training set

○ 1063 papers in training set

Computational and Structural Biotechnology Journal

● 216 papers in training set

Genome Research

● 409 papers in training set

Nature Computational Science

○ 50 papers in training set

Nature Biomedical Engineering

○ 42 papers in training set

Biophysical Journal

○ 545 papers in training set