
BTEXgenie: A curated and user-friendly tool for profile HMM-based substrate-specific annotation of BTEX degradation genes

Qu, J.; Garber, A. I.; Armbruster, C. R.

2026-05-15 bioinformatics
10.64898/2026.05.12.724592 bioRxiv

Background: Benzene, toluene, ethylbenzene, and xylene (BTEX) are volatile aromatic hydrocarbons that are widespread environmental pollutants arising from petroleum processing, fuel combustion, and other industrial activities. Persistent BTEX contamination poses substantial risks to human health and ecosystems, underscoring the need for effective long-term remediation strategies. Microbial bioremediation is a promising and sustainable approach for BTEX removal, but developing it requires accurate detection of the genes and pathways responsible for substrate-specific degradation. Although profile hidden Markov model (HMM) databases are widely used for functional annotation, existing annotation resources lack the substrate-specific resolution needed to distinguish between closely related BTEX-degrading enzymes with different catalytic specificities.

Results: We developed BTEXgenie as a sensitive annotation tool that uses custom HMMs built from alignments of experimentally validated BTEX degradation proteins to identify genes involved in the initial steps of aerobic and anaerobic BTEX degradation. BTEXgenie improved detection of anaerobic BTEX degradation genes that were absent from KOfam annotations. In benchmarking against the KEGG KOfam HMM database, BTEXgenie achieved 17.73% higher overall sensitivity while maintaining comparable specificity at 97.02% across genes involved in BTEX degradation pathways. When applied to environmental metagenomes, BTEXgenie recovered pathway patterns consistent with reported site characteristics and known degradation potential. In addition to gene annotation, BTEXgenie supports downstream interpretation through KEGG pathway-based visualization of detected functions and Circos-based visualization of genomic hit distributions.

Conclusions: BTEXgenie is a substrate-specific annotation tool built from custom HMMs for detecting genes involved in BTEX degradation. By integrating gene annotation with pathway and genome-level visualizations, BTEXgenie facilitates characterization of microbial BTEX degradation potential in environmental and comparative genomic studies.
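
The sensitivity and specificity figures reported in the abstract are presumably the standard confusion-matrix ratios, although the abstract does not spell out the benchmarking procedure. The following is a minimal sketch under that assumption; the function names and the example counts are hypothetical and are not taken from the paper.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real target genes that the annotation tool detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-target genes that the tool correctly leaves unannotated."""
    return true_neg / (true_neg + false_pos)


# Hypothetical benchmark counts, for illustration only (not from the paper).
example_sensitivity = sensitivity(true_pos=8, false_neg=2)
example_specificity = specificity(true_neg=97, false_pos=3)

print(f"sensitivity = {example_sensitivity:.2%}")
print(f"specificity = {example_specificity:.2%}")
```

Comparing two annotation databases on the same labeled gene set would then amount to computing both ratios for each database and taking the difference.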

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Water Research; 74 papers in training set; Top 0.2%; 14.6%
2. Science of The Total Environment; 179 papers in training set; Top 0.7%; 10.3%
3. Environmental Science & Technology Letters; 22 papers in training set; Top 0.1%; 4.9%
4. PLOS ONE; 4510 papers in training set; Top 30%; 4.9%
5. Environment International; 42 papers in training set; Top 0.3%; 4.9%
6. mSystems; 361 papers in training set; Top 2%; 4.9%
7. Bioinformatics; 1061 papers in training set; Top 5%; 4.0%
8. Scientific Reports; 3102 papers in training set; Top 34%; 3.7%

50% of probability mass above

9. Computational and Structural Biotechnology Journal; 216 papers in training set; Top 2%; 2.8%
10. Journal of Hazardous Materials; 19 papers in training set; Top 0.3%; 2.6%
11. Environmental Research; 46 papers in training set; Top 0.6%; 2.5%
12. Environmental Science & Technology; 64 papers in training set; Top 1%; 2.1%
13. Frontiers in Microbiology; 375 papers in training set; Top 4%; 2.1%
14. BMC Bioinformatics; 383 papers in training set; Top 4%; 1.9%
15. mSphere; 281 papers in training set; Top 3%; 1.9%
16. Environmental Science: Water Research & Technology; 13 papers in training set; Top 0.2%; 1.8%
17. Nature Communications; 4913 papers in training set; Top 51%; 1.7%
18. Microbiome; 139 papers in training set; Top 2%; 1.7%
19. Microbiology Resource Announcements; 22 papers in training set; Top 0.5%; 1.2%
20. PLOS Computational Biology; 1633 papers in training set; Top 19%; 1.2%
21. ACS Synthetic Biology; 256 papers in training set; Top 2%; 1.1%
22. Microbiology Spectrum; 435 papers in training set; Top 4%; 1.1%
23. Scientific Data; 174 papers in training set; Top 2%; 0.9%
24. Toxicological Sciences; 38 papers in training set; Top 0.5%; 0.8%
25. ISME Communications; 103 papers in training set; Top 2%; 0.8%
26. The Journal of Infectious Diseases; 182 papers in training set; Top 4%; 0.8%
27. Limnology and Oceanography: Methods; 11 papers in training set; Top 0.4%; 0.8%
28. Frontiers in Plant Science; 240 papers in training set; Top 6%; 0.7%
29. Antimicrobial Resistance & Infection Control; 10 papers in training set; Top 0.5%; 0.5%
30. The American Journal of Tropical Medicine and Hygiene; 60 papers in training set; Top 5%; 0.5%
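
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked with a running sum. The sketch below assumes that the last percentage listed for each journal is its predicted probability; the values are copied from the first ten entries of the list, and the function name is hypothetical.

```python
# Last percentage listed for journals ranked 1 to 10 in the list above.
probabilities = [14.6, 10.3, 4.9, 4.9, 4.9, 4.9, 4.0, 3.7, 2.8, 2.6]


def journals_to_reach(probs, threshold=50.0):
    """Return (rank, cumulative mass) at the first rank whose running sum reaches threshold."""
    total = 0.0
    for rank, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None  # threshold never reached with the values supplied


rank, mass = journals_to_reach(probabilities)
print(f"Threshold reached at rank {rank} with {mass:.1f}% cumulative mass")
```

The running sum is 48.5% after seven journals and about 52.2% after eight, which matches the stated cutoff.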