
Exploring the accuracy of ab initio prediction methods for viral pseudoknotted RNA structures

Medeiros, V. M.; Pearl, J. M.; Carboni, M.; Er, E. M.; Zafeiri, S.

2024-03-26 bioinformatics Community evaluation
10.1101/2024.03.21.586060 bioRxiv

The prediction of tertiary RNA structures is significant to medicine (e.g., mRNA vaccines, genome editing) and to the exploration of viral transcripts. Although many RNA folding programs exist, few studies have focused solely on viral pseudoknotted RNA. These regulatory pseudoknots play a role in genome replication, gene expression, and protein synthesis. This study explores five RNA folding engines that compute either the minimum free energy (MFE) or the maximum expected accuracy (MEA) structure. The folding engines were tested against 26 experimentally derived short pseudoknotted sequences (20-150 nt) using metrics commonly applied to assess software prediction accuracy (e.g., F1 score, PPV). This paper reports that some RNA prediction engines, such as pKiss, are more accurate than previous iterations of the software and than older folding engines. The results show that MEA folding software does not always outperform MFE folding software on viral pseudoknotted RNA when assessed with metrics such as percent error, sensitivity, PPV, and F1 score. Moreover, the results suggest that thermodynamic model parameters will not ensure accuracy if auxiliary parameters such as Mg²⁺ binding, dangling-end options, and H-type penalties are not applied. The observations reported in this paper highlight the differences in quality between ab initio prediction methods while reinforcing the idea that a better understanding of intracellular thermodynamics is necessary for more effective screening of RNAs.

Importance

The importance of accurately predicting RNA structures cannot be overstated, particularly in the context of viral biology and the development of therapeutic interventions such as mRNA vaccines and genome editing. Our study addresses a gap in the existing literature by concentrating solely on viral pseudoknotted RNA, which plays a crucial role in viral replication, gene expression, and protein synthesis. It also sheds light on the debate surrounding minimum free energy (MFE) versus maximum expected accuracy (MEA) models in RNA folding predictions. Contrary to existing beliefs, we found that MEA models do not consistently outperform MFE models, especially in the context of viral pseudoknotted RNAs. Our research contributes to computational biology by providing insights into the efficacy of different prediction methods and by emphasizing the need for a deeper understanding of intracellular thermodynamics to improve RNA structure predictions.
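
The abstract names sensitivity, PPV, and F1 score as accuracy metrics but does not define them. The sketch below shows one common way these are computed for RNA secondary structure: by comparing the set of base pairs in a predicted structure against a reference structure. It assumes both structures are given in extended dot-bracket notation, where pseudoknotted pairs use a second bracket type such as `[` and `]`, and it counts only exact base-pair matches. The function names and the example structures are hypothetical and are not taken from the paper; the authors' exact scoring procedure may differ.

```python
from typing import Dict, List, Set, Tuple

# Bracket types allowed in extended dot-bracket notation (assumed here).
BRACKETS = {"(": ")", "[": "]", "{": "}", "<": ">"}


def base_pairs(structure: str) -> Set[Tuple[int, int]]:
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    closers = {close: open_ for open_, close in BRACKETS.items()}
    # One stack per bracket type, so crossing (pseudoknotted) pairs parse correctly.
    stacks: Dict[str, List[int]] = {open_: [] for open_ in BRACKETS}
    pairs: Set[Tuple[int, int]] = set()
    for i, ch in enumerate(structure):
        if ch in BRACKETS:
            stacks[ch].append(i)
        elif ch in closers:
            stack = stacks[closers[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at position {i}")
            pairs.add((stack.pop(), i))
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return pairs


def score(reference: str, predicted: str) -> Dict[str, float]:
    """Compute base-pair-level sensitivity, PPV, and F1 score."""
    ref, pred = base_pairs(reference), base_pairs(predicted)
    tp = len(ref & pred)   # pairs predicted and present in the reference
    fp = len(pred - ref)   # pairs predicted but absent from the reference
    fn = len(ref - pred)   # reference pairs that were missed
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return {"sensitivity": sensitivity, "ppv": ppv, "f1": f1}


# Hypothetical H-type pseudoknot and a prediction that misses the second stem.
reference = "((((..[[[..))))..]]]"
predicted = "((((.......))))....."
result = score(reference, predicted)
print(result)
```

In this toy case the prediction recovers the first stem but not the pseudoknotted stem, so PPV is perfect while sensitivity drops. That asymmetry is why such evaluations usually report both metrics alongside F1.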

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Probability
1 | PLOS Computational Biology | 1633 | Top 1% | 18.2%
2 | Computational and Structural Biotechnology Journal | 216 | Top 0.2% | 9.0%
3 | BMC Bioinformatics | 383 | Top 1% | 8.2%
4 | Computational Biology and Chemistry | 23 | Top 0.1% | 6.2%
5 | PLOS ONE | 4510 | Top 35% | 4.2%
6 | Briefings in Bioinformatics | 326 | Top 1% | 4.2%
(50% of probability mass above)
7 | The Journal of Physical Chemistry B | 158 | Top 0.5% | 3.9%
8 | Journal of Chemical Information and Modeling | 207 | Top 1% | 3.6%
9 | Scientific Reports | 3102 | Top 39% | 3.5%
10 | Computers in Biology and Medicine | 120 | Top 1.0% | 3.5%
11 | Bioinformatics | 1061 | Top 6% | 2.7%
12 | International Journal of Molecular Sciences | 453 | Top 5% | 2.0%
13 | PeerJ | 261 | Top 6% | 1.8%
14 | Frontiers in Genetics | 197 | Top 5% | 1.7%
15 | Physical Biology | 43 | Top 1% | 1.6%
16 | Journal of Bioinformatics and Systems Biology | 14 | Top 0.2% | 1.5%
17 | Biology Methods and Protocols | 53 | Top 1% | 1.5%
18 | Biophysical Journal | 545 | Top 3% | 1.5%
19 | F1000Research | 79 | Top 3% | 1.2%
20 | RNA Biology | 70 | Top 0.3% | 1.2%
21 | Frontiers in Bioinformatics | 45 | Top 0.6% | 0.9%
22 | RNA | 169 | Top 0.4% | 0.9%
23 | NAR Genomics and Bioinformatics | 214 | Top 3% | 0.9%
24 | Proteins: Structure, Function, and Bioinformatics | 82 | Top 0.8% | 0.9%
25 | Viruses | 318 | Top 5% | 0.8%
26 | Bioinformatics Advances | 184 | Top 5% | 0.7%
27 | Journal of Chemical Theory and Computation | 126 | Top 0.9% | 0.7%
28 | Nucleic Acids Research | 1128 | Top 18% | 0.7%
29 | Journal of Molecular Biology | 217 | Top 4% | 0.6%
30 | ACS Omega | 90 | Top 5% | 0.6%