
Quantifying Structural Diversity of CNG Trinucleotide Repeats Using Diagrammatic Algorithms

Phan, E. N. H.; Mak, C. H.

2020-05-31 biophysics
10.1101/2020.05.30.124636 bioRxiv

Trinucleotide repeat expansion disorders (TREDs) exhibit complex mechanisms of pathogenesis, some of which have been attributed to RNA transcripts of overexpanded CNG repeats, possibly resulting in a gain of function. In this paper, we aim to probe the structures of these expanded transcripts by analyzing the structural diversity of their conformational ensembles. We used graphs to catalog the structures of NG-(CNG)16-CN and NG-(CNG)50-CN oligomers, grouped them into sub-ensembles based on their characters, and calculated the structural diversity and thermodynamic stability of these ensembles using a previously described graph factorization scheme. Our findings show that the generally assumed structure for CNG repeats, a series of canonical helices connected by two-way junctions and capped with a hairpin loop, may not be the most thermodynamically favorable, and that the ensembles are characterized by largely open and less structured conformations. Furthermore, the behavior of the ensemble diversity as higher-order diagrams are included depends on sequence length, suggesting that further studies of CNG repeats are needed at the length scale of TRED onset to properly understand their structural diversity and how this might relate to their functions.

STATEMENT OF SIGNIFICANCE

Trinucleotide repeats are DNA satellites that are prone to mutations in the human genome. A family of diverse disorders is associated with an overexpansion of CNG repeats occurring in noncoding regions, and the RNA transcripts of the expanded regions have been implicated as the origin of toxicity. Our understanding of the structures of these expanded RNA transcripts is based on sequences that have limited lengths compared to the scale of the expanded transcripts found in patients. In this paper, we introduce a theoretical method aimed at analyzing the structure and conformational diversity of CNG repeats, which has the potential to overcome the current length limitations in studies of trinucleotide repeat sequences.

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. Genes; 126 papers in training set; Top 0.1%; 12.5%
2. PLOS Computational Biology; 1633 papers in training set; Top 4%; 8.4%
3. Computational and Structural Biotechnology Journal; 216 papers in training set; Top 0.4%; 6.8%
4. PLOS ONE; 4510 papers in training set; Top 25%; 6.8%
5. Nucleic Acids Research; 1128 papers in training set; Top 3%; 6.4%
6. Scientific Reports; 3102 papers in training set; Top 18%; 6.3%
7. RNA; 169 papers in training set; Top 0.1%; 3.6%

50% of probability mass above

8. International Journal of Molecular Sciences; 453 papers in training set; Top 3%; 3.6%
9. Biochemistry and Biophysics Reports; 28 papers in training set; Top 0.2%; 2.9%
10. NAR Genomics and Bioinformatics; 214 papers in training set; Top 1%; 2.6%
11. Biophysical Journal; 545 papers in training set; Top 2%; 2.1%
12. Biomolecules; 95 papers in training set; Top 0.3%; 1.9%
13. RNA Biology; 70 papers in training set; Top 0.2%; 1.7%
14. Computational Biology and Chemistry; 23 papers in training set; Top 0.2%; 1.5%
15. Journal of Chemical Information and Modeling; 207 papers in training set; Top 2%; 1.2%
16. International Journal of Biological Macromolecules; 65 papers in training set; Top 2%; 1.2%
17. Physical Biology; 43 papers in training set; Top 2%; 1.2%
18. Frontiers in Molecular Biosciences; 100 papers in training set; Top 3%; 1.1%
19. Frontiers in Cell and Developmental Biology; 218 papers in training set; Top 7%; 1.0%
20. Bioinformatics Advances; 184 papers in training set; Top 4%; 1.0%
21. Cells; 232 papers in training set; Top 4%; 1.0%
22. Biosystems; 18 papers in training set; Top 0.3%; 0.9%
23. Biochemical and Biophysical Research Communications; 78 papers in training set; Top 1%; 0.9%
24. Biophysical Chemistry; 14 papers in training set; Top 0.1%; 0.8%
25. Frontiers in Genetics; 197 papers in training set; Top 10%; 0.7%
26. PeerJ; 261 papers in training set; Top 15%; 0.7%
27. BMC Bioinformatics; 383 papers in training set; Top 7%; 0.7%
28. Biology; 43 papers in training set; Top 3%; 0.7%
29. Journal of Structural Biology; 58 papers in training set; Top 2%; 0.6%
30. Journal of Biological Chemistry; 641 papers in training set; Top 5%; 0.6%
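
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch of that arithmetic only; the function name `journals_to_reach` and the variable `probabilities` are hypothetical and are not part of the page or of the underlying prediction model.

```python
from itertools import accumulate

# Predicted probabilities (percent) for the ten top-ranked journals,
# copied in rank order from the list above.
probabilities = [12.5, 8.4, 6.8, 6.8, 6.4, 6.3, 3.6, 3.6, 2.9, 2.6]


def journals_to_reach(probs, threshold):
    """Count how many top-ranked entries are needed for the running
    total to reach the threshold; return None if it is never reached."""
    for count, running_total in enumerate(accumulate(probs), start=1):
        if running_total >= threshold:
            return count
    return None


print(journals_to_reach(probabilities, 50.0))
```

The running total after six journals is about 47.2%, and the seventh entry brings it to about 50.8%, which is consistent with the separator placed after rank 7.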