Integrating targeted genome mining and structure-guided modeling reveals unexplored 7-deazapurine-containing pathways

Cediel-Becerra, J. D. D.; Chevrette, M. G.; de Crecy-Lagard, V.; Dias, R.

2026-04-19 bioinformatics

10.64898/2026.04.15.718813 bioRxiv

7-deazapurines are nucleoside analogs that play key roles in nucleic acid modification and can serve as building blocks for diverse, bioactive secondary metabolites. Despite their biological significance, their biosynthetic diversity, distribution, and the enzymatic determinants of their structural diversification remain poorly understood. Here, we leverage large-scale targeted genome mining, phylogenetic analysis, and network analysis to explore 7-deazapurine-containing pathways across ~2 million bacterial genomes. We identified over 900 candidate biosynthetic gene clusters (BGCs), grouped into more than 100 families, most of which remain uncharacterized. These GATOR-GC-predicted BGCs were predominantly found in Streptomyces. We then examined enzyme-substrate interactions in three representative pathways: (i) peptidyl-deazapurines, (ii) huimycin, and (iii) dapiramicin A. Molecular docking and molecular dynamics (MD) simulations recapitulated known enzyme-substrate interactions and highlighted candidate catalytic residues governing amide bond formation, methylation, and glycosylation. Using this genome- and structure-guided framework, we identified a candidate BGC for dapiramicin A and proposed tailoring steps, including scaffold methylation and deoxy-sugar formation. These findings expand the known diversity of 7-deazapurine-containing BGCs and demonstrate how integrating genome mining with structural modeling can link BGCs to chemical function, providing a foundation for discovering and characterizing 7-deazapurine-containing secondary metabolites.

Graphical abstract: Figure 1 (image not reproduced).
