Joint enzyme-reaction retrieval and catalytic optima prediction via multimodal fusion
Cai, Y.; Yang, F.; Liu, J.
Motivation: Enzyme-reaction retrieval is increasingly used to prioritize candidate biocatalysts for experimental follow-up. A useful recommendation should indicate not only whether an enzyme can catalyze a target reaction but also the pH and temperature at which it should be tested. Existing retrieval models optimize catalytic matching scores, whereas catalytic optima predictors are typically developed as enzyme-level regressors, because public pH and temperature annotations are sparse and often available only at the enzyme or EC-associated record level. This separation leaves a practical gap: high-ranking enzyme-reaction pairs are not evaluated for condition suitability, and enzyme-level optima predictions do not use the reaction context being retrieved.

Results: We present GERO, a multimodal fusion framework that uses feature-gated cross-modal fusion to integrate global enzyme sequence semantics, sequence-derived pocket geometry, and molecular reaction representations for condition-aware enzyme-reaction retrieval and catalytic optima estimation with reaction context. To evaluate this setting, we define the tolerance-restricted hit rate (Hit@k-TR), which requires both top-k retrieval of the correct candidate and condition prediction within predefined tolerances. Across enzyme- and reaction-similarity splits, GERO improves Hit@k-TR over two-stage retrieval-then-prediction baselines. Representative benchmark examples and an iodinin biosynthesis case study further illustrate GERO's ability to provide candidate rankings together with plausible assay-condition estimates for downstream experimental prioritization.

Availability and implementation: Source code is available at https://github.com/ykxhs/GERO.

Contact: liujuan@whu.edu.cn

Supplementary information: Supplementary data are available at XXXX online.
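The Hit@k-TR criterion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Query` record, the function name `hit_at_k_tr`, and the tolerance defaults (1.0 pH unit, 10 °C) are hypothetical placeholders, and the sketch assumes the predicted conditions are those reported for the correct enzyme-reaction pair. A query counts as a hit only if the correct candidate appears in the top k and both predicted optima fall within tolerance.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Query:
    """One retrieval query (hypothetical record layout, for illustration only)."""
    ranked_ids: Sequence[str]  # candidate IDs, sorted best-first by the model
    true_id: str               # ID of the correct candidate
    pred_ph: float             # predicted optimal pH
    true_ph: float             # annotated optimal pH
    pred_temp: float           # predicted optimal temperature (deg C)
    true_temp: float           # annotated optimal temperature (deg C)


def hit_at_k_tr(queries: Sequence[Query], k: int,
                ph_tol: float = 1.0, temp_tol: float = 10.0) -> float:
    """Fraction of queries that are retrieved in the top k AND within tolerance.

    The tolerance defaults are assumed placeholder values, not taken from the paper.
    """
    if not queries:
        return 0.0
    hits = 0
    for q in queries:
        retrieved = q.true_id in q.ranked_ids[:k]
        within_tolerance = (abs(q.pred_ph - q.true_ph) <= ph_tol
                            and abs(q.pred_temp - q.true_temp) <= temp_tol)
        if retrieved and within_tolerance:
            hits += 1
    return hits / len(queries)


# Toy data: only the first query satisfies both requirements at k=2.
queries = [
    Query(["e1", "e2", "e3"], "e1", 7.2, 7.0, 40.0, 37.0),  # top-2, in tolerance
    Query(["e4", "e5", "e6"], "e5", 6.0, 6.5, 70.0, 45.0),  # top-2, temperature off
    Query(["e7", "e8", "e9"], "e9", 8.0, 8.0, 30.0, 30.0),  # rank 3, outside top-2
]
score = hit_at_k_tr(queries, k=2)
```

On this toy data a plain Hit@2 would credit two of the three queries, while the tolerance-restricted variant credits only one, which is the gap between retrieval quality and condition suitability that the metric is meant to expose.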