sxRaep: A Rapid and Accurate Enzyme Predictor for high-throughput mining of enzymatic sequences
Duan, H.; Han, X.; Mo, Y.; Ren, B.; Xia, L. C.
Motivation: Metagenomic sequencing generates petabyte-scale sequence datasets that strain both deep learning and alignment-based enzyme annotation tools. A lightweight, rapid, and accurate filter is needed to identify enzymatic sequences before resource-intensive functional prediction.
Results: We present sxRaep (Rapid and Accurate Enzyme Predictor), a resource-efficient framework that uses lightweight physicochemical features for enzyme pre-screening. sxRaep achieves a 6,604-fold speedup over Diamond (0.002 seconds per inference) and a 62.1% memory reduction relative to Diamond (372 MB peak), while maintaining 99.4% accuracy and the highest recall in remote homology detection. This lightweight approach identifies enzymatic candidates missed by alignment-based methods without sacrificing accuracy.
Availability and Implementation: sxRaep is available as a Python package at https://pypi.org/project/raep/, is maintained as an open-source repository at https://github.com/labxscut/sxRaep, and can be deployed using the Docker image cirinmok/raep:python3.11 (https://hub.docker.com/r/cirinmok/raep/tags), which provides a reproducible Python 3.11 environment for enzyme prediction and model execution.
Contact: lcxia@scut.edu.cn
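To make "lightweight physicochemical features" concrete, the following is a minimal sketch of the kind of per-sequence descriptors such a pre-screening filter could compute. It is an illustration only and not the sxRaep feature set or API: the function name `physicochemical_features`, the choice of features (amino acid composition, mean Kyte-Doolittle hydropathy, charged-residue fraction, length), and the output keys are all assumptions made here for illustration.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale (a standard published scale).
# Using it here is an illustrative assumption, not a documented sxRaep choice.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(KYTE_DOOLITTLE)


def physicochemical_features(sequence: str) -> dict[str, float]:
    """Return simple composition-based features for one protein sequence.

    Hypothetical helper: residues outside the 20 standard amino acids
    (for example X or *) are ignored rather than raising an error.
    """
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")

    n = len(residues)
    counts = Counter(residues)

    # Fraction of each of the 20 standard amino acids.
    features = {f"frac_{aa}": counts[aa] / n for aa in AMINO_ACIDS}
    # Mean hydropathy over the sequence (often called GRAVY).
    features["gravy"] = sum(KYTE_DOOLITTLE[aa] for aa in residues) / n
    # Fraction of residues with charged side chains (D, E, K, R).
    features["frac_charged"] = sum(counts[aa] for aa in "DEKR") / n
    features["length"] = float(n)
    return features


if __name__ == "__main__":
    example = physicochemical_features("MKTAYIAKQRQISFVKSHFSRQ")
    print(round(example["gravy"], 3), example["length"])
```

Features of this kind need only a single pass over each sequence and no database search, which is the general reason composition-based filters can be much cheaper than alignment; a classifier trained on such vectors would then decide whether a sequence goes on to full functional prediction.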