Conservation of Long G4-rich (LG4) genomic enhancer regulations

Shaw, M. H.; DeMeis, J. D.; Arnold, C. A.; Cox, M. R.; Duong, T. C.; Gaviria, K. A.; McDavid, G. K.; Villegas, J. M.; Weimer, M. L.; Patil, S. S.; Alqudah, S. Y.; Borchert, G. M.

2026-03-13 genomics
10.64898/2026.03.11.711068 bioRxiv

Long G4-rich regions (LG4s) are DNA sequences containing a high density of guanine triplets capable of forming non-B DNA structures called G-quadruplexes (G4s). These regions frequently overlap with enhancers, regulatory DNA elements that modulate gene expression by interacting with promoters, the DNA regions that dictate where transcription is initiated. While LG4s have been well characterized in the human genome, neither the occurrence of LG4s nor their ability to function as enhancers has been described in other species. To address this, we screened the genomes of 16 species from various taxa to identify LG4s, then determined whether they were conserved and, if so, whether their regulatory capacity was similarly conserved. Our analyses characterized a number of previously unreported LG4s in the human genome as well as LG4s in 13 additional species. Of note, we identified a highly conserved LG4 enhancer predicted to regulate over 40 genes. This LG4 is embedded in the MAZ (Myc-Associated Zinc finger protein) locus, and we find that it can directly interact with the same target promoter in both human and mouse. In summary, this work describes LG4s in the genomes of both unicellular and multicellular species, including vertebrates, invertebrates, plants, and fungi. Furthermore, many of these LG4 sequences are highly conserved, as is their regulatory capacity.
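
To make the "high density of guanine triplets" definition concrete, the following is a minimal Python sketch of a sliding-window scan that counts GGG triplets. It is not the authors' pipeline: the function names (`count_g_triplets`, `dense_windows`), the window and step sizes, and the `min_triplets` threshold are all hypothetical choices for illustration, and only the given strand is scanned.

```python
import re

# A run of three or more consecutive guanines; treating each full GGG inside
# a run as one triplet is an illustrative assumption, not the paper's rule.
G_RUN = re.compile(r"G{3,}")


def count_g_triplets(seq: str) -> int:
    """Count non-overlapping GGG triplets on the given strand."""
    return sum(len(m.group()) // 3 for m in G_RUN.finditer(seq.upper()))


def dense_windows(seq: str, window: int = 50, step: int = 25, min_triplets: int = 6):
    """Return (start, end, triplet_count) for each window that meets the
    hypothetical min_triplets threshold."""
    hits = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        n = count_g_triplets(chunk)
        if n >= min_triplets:
            hits.append((start, start + len(chunk), n))
    return hits
```

Calling `dense_windows(sequence)` on a DNA string returns the coordinates of candidate G-triplet-dense windows. A real screen would also need to consider the complementary strand (runs of C) and use thresholds taken from the paper's methods.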

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

1. Nucleic Acids Research: 1128 papers in training set; Top 1%; 11.9%
2. PLOS Genetics: 756 papers in training set; Top 1%; 9.7%
3. Nature Communications: 4913 papers in training set; Top 27%; 6.6%
4. Mobile DNA: 27 papers in training set; Top 0.1%; 6.1%
5. Frontiers in Genetics: 197 papers in training set; Top 2%; 4.1%
6. PLOS ONE: 4510 papers in training set; Top 41%; 3.5%
7. Scientific Reports: 3102 papers in training set; Top 40%; 3.5%
8. Genome Research: 409 papers in training set; Top 1%; 3.5%
9. Genome Biology and Evolution: 280 papers in training set; Top 0.5%; 3.5%

50% of probability mass above

10. BMC Genomics: 328 papers in training set; Top 1%; 3.5%
11. Genes: 126 papers in training set; Top 0.4%; 3.0%
12. eLife: 5422 papers in training set; Top 30%; 3.0%
13. Genome Biology: 555 papers in training set; Top 3%; 3.0%
14. NAR Genomics and Bioinformatics: 214 papers in training set; Top 1%; 2.8%
15. PLOS Computational Biology: 1633 papers in training set; Top 15%; 1.8%
16. Cell Reports: 1338 papers in training set; Top 25%; 1.6%
17. G3 Genes|Genomes|Genetics: 351 papers in training set; Top 1%; 1.6%
18. BMC Biology: 248 papers in training set; Top 1%; 1.6%
19. Genomics, Proteomics & Bioinformatics: 171 papers in training set; Top 4%; 1.3%
20. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 40%; 0.9%
21. Genomics: 60 papers in training set; Top 2%; 0.9%
22. The Plant Journal: 197 papers in training set; Top 3%; 0.9%
23. BMC Genomic Data: 12 papers in training set; Top 0.1%; 0.9%
24. Genetics: 225 papers in training set; Top 4%; 0.8%
25. BMC Cancer: 52 papers in training set; Top 3%; 0.7%
26. iScience: 1063 papers in training set; Top 35%; 0.7%
27. Philosophical Transactions of the Royal Society B: 51 papers in training set; Top 6%; 0.7%
28. Molecular Biology and Evolution: 488 papers in training set; Top 5%; 0.7%
29. Journal of Genetics and Genomics: 36 papers in training set; Top 2%; 0.7%
30. Science Advances: 1098 papers in training set; Top 31%; 0.7%
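
The statement that the top 9 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed percentages. The sketch below is only an illustration: the helper name `ranks_to_reach` is hypothetical, and because the displayed percentages are rounded, the cumulative total is approximate.

```python
def ranks_to_reach(probabilities, target=50.0):
    """Return (rank, cumulative_percent) for the first rank whose running
    total reaches the target, or None if the target is never reached."""
    total = 0.0
    for rank, p in enumerate(probabilities, start=1):
        total += p
        if total >= target:
            return rank, round(total, 1)
    return None


# Predicted probabilities (percent) for the ten highest-ranked journals,
# copied from the listing above.
top_ten = [11.9, 9.7, 6.6, 6.1, 4.1, 3.5, 3.5, 3.5, 3.5, 3.5]
rank, cumulative = ranks_to_reach(top_ten)
```

With the listed values, the running total first reaches 50% at rank 9, which matches where the page places its "50% of probability mass above" marker.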