Spectral Graph Features for Reference-free RNA 3D Quality Assessment

Zhu, Y.; Zhang, H.; Calhoun, V. D.; Bi, Y.

2026-04-09 bioinformatics
10.64898/2026.04.06.716854 bioRxiv

Motivation: Existing RNA 3D structure quality assessment (QA) methods rely on local geometric descriptors or statistical potentials that evaluate atomic-level contacts but are blind to global topological coherence. This creates a critical failure mode, structures that are "locally correct but globally wrong", where well-formed local helices mask misplaced domains and incorrect overall packing.

Results: We introduce SpecRNA-QA, a lightweight RNA QA method based on multi-scale graph-Laplacian features of inter-nucleotide contact networks. In CASP16 leave-one-out cross-validation, it achieves a median per-target Spearman ρ = 0.69 (target-clustered bootstrap 95% CI [0.64, 0.73]) versus 0.47 for an internal geometry baseline. This +0.22 gap is significant at p = 1.2 × 10⁻¹⁰ (one-sided Wilcoxon signed-rank) and reflects a per-target win rate of 93%. The gain is concentrated on large, multi-domain RNAs, where global coherence is poorly captured by local descriptors. In a contextual comparison with established statistical potentials, local energy-based scores remain strongest on compact RNAs, while SpecRNA-QA yields the strongest signal we observed on targets longer than 200 nt. Within the single-threaded runtime budget used here, the strongest local-energy comparator, rsRNASP, timed out on 22 of 26 large targets, and we report an explicit paired head-to-head on the four commonly scored targets in Section 4.2. A training-free heuristic variant further shows that the spectral prior carries intrinsic quality information even in the absence of labeled QA data.

Availability: SpecRNA-QA is available as a Python package at https://github.com/yudabitrends/specrnaq.

Contact: ybi3@gsu.edu

Supplementary information: Supplementary data are available online.

Key Points
- SpecRNA-QA uses multi-scale graph-Laplacian spectra to score global RNA fold coherence that local geometric descriptors and local statistical potentials can miss.
- The method uncovers a size-dependent division of labor: on compact RNAs that can be scored exhaustively, atom-level statistical potentials such as rsRNASP remain strongest, whereas on >200 nt RNAs, where the strongest local comparator times out on most targets under the single-threaded runtime budget used here, SpecRNA-QA provides the strongest signal we observed.
- Heat-kernel traces at intermediate diffusion times emerge as the most discriminative spectral features and form an interpretable bridge between local packing and long-range tertiary organization.
- A training-free heuristic variant of SpecRNA-QA retains informative spectral signal without any labeled QA data, supporting the interpretation of the learned model as amplifying a real structural signal rather than overfitting one.
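
To make the idea of heat-kernel traces on a contact network concrete, the following is a minimal sketch and not the SpecRNA-QA implementation. It assumes one representative 3D coordinate per nucleotide, a hypothetical distance cutoff of 15 Å, a symmetric normalized Laplacian, and three illustrative diffusion times; the abstract specifies none of these choices, and the function names `contact_graph`, `normalized_laplacian`, and `heat_kernel_traces` are invented for illustration.

```python
import numpy as np

def contact_graph(coords, cutoff=15.0):
    """Binary adjacency: nucleotides i and j are linked if closer than cutoff.

    The cutoff (in Angstrom) and the one-point-per-nucleotide representation
    are assumptions made for this sketch.
    """
    coords = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    adj = (dist < cutoff).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-contacts
    return adj

def normalized_laplacian(adj):
    """Symmetric normalized Laplacian; isolated nodes get an all-zero row."""
    deg = adj.sum(axis=1)
    connected = deg > 0
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[connected] = 1.0 / np.sqrt(deg[connected])
    scaled_adj = inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    return np.diag(connected.astype(float)) - scaled_adj

def heat_kernel_traces(adj, times=(0.1, 1.0, 10.0)):
    """Size-normalized heat-kernel trace (1/n) * sum_k exp(-t * lambda_k) per time t."""
    eigvals = np.linalg.eigvalsh(normalized_laplacian(adj))
    return np.array([np.exp(-t * eigvals).mean() for t in times])

# Toy example: three nucleotides on a line, 10 Angstrom apart (a path graph).
toy_coords = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [20.0, 0.0, 0.0]]
features = heat_kernel_traces(contact_graph(toy_coords, cutoff=12.0))
```

Small diffusion times weight the whole spectrum almost equally and so mostly reflect local connectivity, while large times are dominated by the smallest eigenvalues, which encode coarse, global organization. A feature vector over several times is one plausible way to obtain the multi-scale description the abstract refers to.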

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Bioinformatics | 1061 papers in training set | Top 2% | 16.7%
2. Nature Biotechnology | 147 papers in training set | Top 0.6% | 11.8%
3. Nature Methods | 336 papers in training set | Top 0.8% | 11.8%
4. Nature Communications | 4913 papers in training set | Top 24% | 8.0%
5. Cell Systems | 167 papers in training set | Top 2% | 6.8%
(50% of probability mass above)
6. Nucleic Acids Research | 1128 papers in training set | Top 3% | 6.5%
7. PLOS Computational Biology | 1633 papers in training set | Top 6% | 6.0%
8. Bioinformatics Advances | 184 papers in training set | Top 1% | 4.0%
9. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 22% | 3.4%
10. Briefings in Bioinformatics | 326 papers in training set | Top 2% | 2.7%
11. Nature | 575 papers in training set | Top 12% | 1.4%
12. Genome Biology | 555 papers in training set | Top 5% | 1.3%
13. Structure | 175 papers in training set | Top 2% | 1.2%
14. NAR Genomics and Bioinformatics | 214 papers in training set | Top 3% | 1.2%
15. RNA | 169 papers in training set | Top 0.4% | 0.9%
16. Communications Biology | 886 papers in training set | Top 20% | 0.9%
17. Nature Machine Intelligence | 61 papers in training set | Top 3% | 0.9%
18. Cell Reports Methods | 141 papers in training set | Top 5% | 0.8%
19. Genome Research | 409 papers in training set | Top 4% | 0.8%
20. Science | 429 papers in training set | Top 20% | 0.7%
21. Molecular Biology and Evolution | 488 papers in training set | Top 5% | 0.7%
22. Nature Genetics | 240 papers in training set | Top 8% | 0.7%
23. Journal of Chemical Information and Modeling | 207 papers in training set | Top 3% | 0.7%
24. Nature Computational Science | 50 papers in training set | Top 2% | 0.7%
25. Biomolecules | 95 papers in training set | Top 3% | 0.6%
26. Biology Methods and Protocols | 53 papers in training set | Top 3% | 0.6%