
Explainable Artificial Intelligence Reveals Potential Candidate Mechanism of Strain-Specific Drug Depletion

Elbadawi, M.; Abdul Kafoor, N. F.

2026-01-24 microbiology
10.64898/2026.01.23.701358 bioRxiv

Oral medications can be bioaccumulated or metabolised by gastrointestinal bacteria in a process collectively termed drug depletion. The precise biological mechanisms governing strain-specific depletion remain poorly understood, and systematic experimental classification of drug-strain interactions via in vitro studies is both costly and time-consuming. In this study, artificial intelligence (AI) methodologies combining machine learning (ML) and natural language processing (NLP) were applied to predict strain-specific drug depletion. The dataset comprised 16,802 drug-strain interaction pairs, with drugs represented by physicochemical descriptors and bacterial strains represented by whole-genome sequences. NLP techniques were used to transform genomic data into feature representations suitable for ML model training. The resulting models achieved strong predictive performance, with a balanced accuracy of 0.90 ± 0.02 and a Matthews correlation coefficient of 0.54 ± 0.10. Feature importance analysis revealed that both drug properties and genomic features contributed to model predictions. Among the highest-ranking genomic features, BLASTX annotation identified several enzymes with known or plausible roles in drug metabolism. To further explore the mechanistic relevance of these features, two candidate enzymes were selected for molecular docking against drugs experimentally observed to be depleted. Glycosidase showed binding energies of -8.69 and -7.88 kcal/mol for the two cardiac glycoside drugs digitoxin and digoxin, respectively, whereas acetyl-CoA carboxylase biotin carboxylase showed binding energies between -7.09 and -7.74 kcal/mol at one of its druggable sites. Collectively, these findings establish a proof-of-concept AI-driven framework that integrates predictive performance with mechanistic interpretability in the study of drug-microbiome interactions.
The broader implications and limitations of applying AI in this context are also discussed. These preliminary findings offer a promising strategy for accelerating drug development by using AI to rapidly highlight potential drug interactions.

[Graphical Abstract: Figure 1]
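
The abstract reports two classification metrics, balanced accuracy and the Matthews correlation coefficient (MCC). The following is a minimal sketch of how both can be computed from a binary confusion matrix, and of why a high balanced accuracy can coexist with a more moderate MCC when one class is much rarer than the other. The confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the study, and the abstract does not state the class balance of the dataset.

```python
import math


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC from confusion-matrix counts; returns 0.0 when the denominator is zero."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical, imbalanced example (NOT data from the study):
# 100 "depleted" pairs and 1,000 "not depleted" pairs.
counts = {"tp": 90, "fp": 100, "tn": 900, "fn": 10}

ba = balanced_accuracy(**counts)
mcc = matthews_corrcoef(**counts)
precision = counts["tp"] / (counts["tp"] + counts["fp"])

print(f"balanced accuracy = {ba:.2f}")
print(f"MCC               = {mcc:.2f}")
print(f"precision         = {precision:.2f}")
```

In this illustrative case both classes are recalled equally well, so balanced accuracy is high, but false positives outnumber true positives, which lowers precision and therefore the MCC. Reading the two metrics together gives a fuller picture than either one alone.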

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 0.1% | 18.6%
2. Journal of Chemical Information and Modeling | 207 papers in training set | Top 0.4% | 14.3%
3. Advanced Science | 249 papers in training set | Top 1% | 10.1%
4. PLOS Computational Biology | 1633 papers in training set | Top 9% | 4.0%
5. Molecules | 37 papers in training set | Top 0.3% | 3.6%
50% of probability mass above
6. eLife | 5422 papers in training set | Top 32% | 2.6%
7. Journal of Medicinal Chemistry | 68 papers in training set | Top 0.7% | 1.7%
8. Scientific Reports | 3102 papers in training set | Top 60% | 1.7%
9. iScience | 1063 papers in training set | Top 18% | 1.5%
10. Chemical Science | 71 papers in training set | Top 1% | 1.5%
11. ACS Omega | 90 papers in training set | Top 2% | 1.5%
12. The Journal of Physical Chemistry Letters | 58 papers in training set | Top 0.9% | 1.5%
13. International Journal of Molecular Sciences | 453 papers in training set | Top 10% | 1.3%
14. Chemistry – A European Journal | 13 papers in training set | Top 0.3% | 1.3%
15. Briefings in Bioinformatics | 326 papers in training set | Top 5% | 1.3%
16. Communications Chemistry | 39 papers in training set | Top 0.5% | 1.2%
17. PLOS ONE | 4510 papers in training set | Top 62% | 1.1%
18. Pharmaceuticals | 33 papers in training set | Top 1% | 0.9%
19. Synthetic and Systems Biotechnology | 10 papers in training set | Top 0.4% | 0.9%
20. Biomedicine & Pharmacotherapy | 43 papers in training set | Top 0.9% | 0.9%
21. Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 5% | 0.9%
22. Angewandte Chemie International Edition | 81 papers in training set | Top 3% | 0.8%
23. npj Systems Biology and Applications | 99 papers in training set | Top 2% | 0.8%
24. Nature Communications | 4913 papers in training set | Top 61% | 0.8%
25. eBioMedicine | 130 papers in training set | Top 4% | 0.8%
26. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 43% | 0.8%
27. Computers in Biology and Medicine | 120 papers in training set | Top 4% | 0.8%
28. Frontiers in Pharmacology | 100 papers in training set | Top 4% | 0.8%
29. Frontiers in Chemistry | 14 papers in training set | Top 0.4% | 0.7%
30. ACS Infectious Diseases | 74 papers in training set | Top 1% | 0.7%