Transfer Learning Enables Drug-Target Interaction Prediction in Data-Scarce One-Carbon Metabolism

Dalkiran, A.; Cho, T.; Atalay, M. V.; Shin, K. W. D.; Meliton, A. Y.; Woods, P. S.; Shamaa, O. R.; Hamanaka, R. B.; Mutlu, G. M.; Cetin-Atalay, R.

2026-05-05 bioinformatics
10.64898/2026.04.30.721937 bioRxiv
Predicting drug-target interactions (DTIs) with deep learning offers opportunities to accelerate drug discovery, yet performance is constrained by the scarcity of target-specific training data. This is a particular challenge for mitochondrial one-carbon (1C) pathway enzymes, which are attractive therapeutic targets but remain pharmacologically understudied. Mitochondrial 1C metabolism supplies glycine, reducing equivalents, and 1C units critical for nucleotide synthesis, and has emerged as a key pathway in cancer and fibrosis. SHMT2 and MTHFD2, two key 1C enzymes, support collagen production in fibroblasts; blocking either prevents TGF-β-induced glycine and collagen accumulation. Here, we developed transfer learning-based deep learning models to predict interactions between approved drugs and SHMT2 or MTHFD2 despite minimal target-specific training data, pre-training on large datasets from related enzymes before fine-tuning to these targets. Virtual screening of the DrugBank library identified six candidates, three of which, Carbimazole, Crizotinib, and GSK2018682, reduced TGF-β-induced collagen production and glycine accumulation in human lung fibroblasts, demonstrating transfer learning as a strategy for repurposable drug identification in data-scarce metabolic targets.

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | Advanced Science | 249 | Top 0.9% | 12.0%
2 | Nature Chemical Biology | 104 | Top 0.1% | 9.8%
3 | Nature Communications | 4913 | Top 20% | 9.8%
4 | Cell Systems | 167 | Top 2% | 6.2%
5 | Cell Chemical Biology | 81 | Top 0.6% | 4.2%
6 | Nature Machine Intelligence | 61 | Top 0.7% | 4.2%
7 | Science Advances | 1098 | Top 4% | 3.9%
(50% of probability mass above)
8 | Nature Biotechnology | 147 | Top 3% | 3.5%
9 | Science | 429 | Top 10% | 3.5%
10 | Proceedings of the National Academy of Sciences | 2130 | Top 22% | 3.5%
11 | Nature | 575 | Top 8% | 3.0%
12 | Nature Metabolism | 56 | Top 0.8% | 2.7%
13 | Cell Genomics | 162 | Top 3% | 2.0%
14 | Cell Reports | 1338 | Top 23% | 1.8%
15 | Nature Methods | 336 | Top 5% | 1.7%
16 | eLife | 5422 | Top 46% | 1.4%
17 | Nature Biomedical Engineering | 42 | Top 1% | 1.4%
18 | Cell Reports Medicine | 140 | Top 5% | 1.3%
19 | Communications Biology | 886 | Top 13% | 1.3%
20 | Genome Medicine | 154 | Top 6% | 1.2%
21 | Nucleic Acids Research | 1128 | Top 14% | 1.1%
22 | Protein & Cell | 25 | Top 2% | 0.8%
23 | Nature Cell Biology | 99 | Top 5% | 0.7%
24 | Cell Research | 49 | Top 3% | 0.7%
25 | Cell | 370 | Top 18% | 0.7%
26 | Nature Genetics | 240 | Top 8% | 0.7%
27 | Genomics, Proteomics & Bioinformatics | 171 | Top 7% | 0.7%
28 | Cancer Research | 116 | Top 4% | 0.6%