Spatial-neighbour encoding enables fast RNA 3D structure search
Wang, D.; Jin, J.; Qiao, J.; Wei, L.; Wu, S.; Liu, Q.
Experimental and predicted RNA three-dimensional structures are expanding rapidly, but RNA structure search still lacks a compact residue-level representation that supports database-scale comparison. Using family-held-out ablations across the currently available experimental RNA structure collection, we found that spatial-neighbour features are markedly more informative for family-level discrimination than conventional backbone and base descriptors. Building on this result, we developed RiboSeek, a search framework based on a 20-letter geometric alphabet (RS-20) and an 80-letter structure-and-base composite alphabet (RS-80). Across family-level classification and retrieval benchmarks, RS-80 delivered the strongest overall performance, whereas RS-20 most closely tracked US-align TM-score, indicating better preservation of geometric similarity. RiboSeek searches the full experimental RNA structure database in 204 ms per query and can be applied to predicted RNA structure libraries to prioritize candidate structural relationships for downstream analysis.
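The abstract does not say how the 80-letter composite alphabet is built. One plausible reading, offered here only as an assumption, is that each RS-80 letter pairs one of 20 geometric states with one of the four RNA bases (20 x 4 = 80). The sketch below illustrates that pairing; the names `composite_code`, `split_code`, and `encode`, and the integer state numbering, are hypothetical and are not taken from RiboSeek.

```python
from typing import List, Sequence, Tuple

# Assumption: RS-80 = 20 geometric states x 4 bases. The abstract gives only
# the alphabet sizes, not the construction.
BASES = "ACGU"
N_GEOM = 20


def composite_code(geom_state: int, base: str) -> int:
    """Map a (geometric state, base) pair to a single code in 0..79."""
    if not 0 <= geom_state < N_GEOM:
        raise ValueError(f"geometric state out of range: {geom_state}")
    if len(base) != 1 or base not in BASES:
        raise ValueError(f"unsupported base: {base!r}")
    return geom_state * len(BASES) + BASES.index(base)


def split_code(code: int) -> Tuple[int, str]:
    """Invert composite_code: recover the geometric state and the base."""
    if not 0 <= code < N_GEOM * len(BASES):
        raise ValueError(f"composite code out of range: {code}")
    geom_state, base_idx = divmod(code, len(BASES))
    return geom_state, BASES[base_idx]


def encode(geom_states: Sequence[int], sequence: str) -> List[int]:
    """Encode one structure residue by residue into composite codes."""
    if len(geom_states) != len(sequence):
        raise ValueError("one geometric state is required per residue")
    return [composite_code(g, b) for g, b in zip(geom_states, sequence)]
```

Under this assumed scheme, dropping the base component of a composite code recovers the purely geometric letter, so the two alphabets would differ only in whether base identity contributes to a match.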