Computational pipeline reveals nature's untapped reservoir of halogenating enzymes
Szenei, J.; Burke, A.; Liong, A.; Korenskaia, A.; Lukowski, A. L.; Ziemert, N.; Nikel, P. I.; Leao, P. N.; Moore, B. S.; Weber, T.; Blin, K.
Microbial halogenated natural products (hNPs) hold ecological, agricultural, and biomedical relevance. The hNP-producing potential of an organism can be assessed by precisely predicting its biosynthetic enzymes, yet detailed annotations of halogenases are often missing from genomic and metagenomic data. We created a manually curated database (https://halogenases.secondarymetabolites.org/) containing information on halide specificity, the role and position of verified catalytic residues, and the results of mutagenesis studies for more than 120 experimentally validated or in silico inferred halogenases. This collection of experimental data supports a computational pipeline that annotates halogenating enzymes at the family, substrate, and halide-scope level using catalytic residues, conserved motifs, and profile Hidden Markov Models (pHMMs). Our analysis with sequence similarity networks (SSNs) highlighted several underexplored clusters in the UniRef50 database. One such finding was a halogenase from Rhodopirellula baltica (RhobaVHPO), previously labelled as a hypothetical chloroperoxidase, which clustered apart from the known chloroperoxidases and bromoperoxidases yet accepted chloride while preferring bromide. Our database and workflow provide extensive and scalable solutions for the systematic and precise annotation of halogenating enzymes in genomic and metagenomic data. The in-depth categorization of halogenases will improve the chemical structure prediction of microbial hNPs, supporting ecological assessments and natural product discovery.
Matching journals
The top 13 journals account for 50% of the predicted probability mass.