
Defining the DNA Binding Specificity of GRHL2

Messa, P. E.; Warren, C. L.; Nicol, N. R.; Pearson, K. S.; Peters, J. P.; Fowler, A. M.; Alarid, E. T.; Ozers, M. S.

2026-04-18 biochemistry
10.64898/2026.04.16.719077 bioRxiv

Grainyhead-like 2 (GRHL2) is an epithelial transcription factor with context-dependent regulatory roles, yet the sequence rules governing its DNA recognition remain incompletely defined. In this study, a high-density genomic Specificity and Affinity for Protein (SNAP) DNA-binding array containing 772,732 tiled probes derived from GRHL2 ChIP-seq regions was used to resolve GRHL2 binding specificity at 6-base-pair resolution across genomic sequences. De novo motif analysis of high-affinity probes recovered the canonical 5′-AACCGGTT-3′ motif. Sequence specificity landscapes revealed a stepwise reduction in binding as mismatches were introduced, with the strongest effects at the C (position 3) and G (position 6) within the motif, greater tolerance at the central CG dinucleotide, and intermediate tolerance at the A/T bases at the motif edges. This analysis also demonstrated the influence of nearby flanking sequences. Extended motif and spacing analyses indicated dimeric binding at paired motifs, with periodic helical spacing consistent with interactions on the same face of the DNA helix. Integration of SNAP array binding with ChIP-seq data distinguished direct, motif-encoded GRHL2 occupancy from indirect, cofactor-mediated recruitment at genomic sites. These results define the sequence specificity of GRHL2 interactions with variations in the DNA consensus motif and flanking sequences within an endogenous genomic context.
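The position numbering used in the abstract (C at position 3, G at position 6, central CG at positions 4 and 5) can be illustrated with a minimal sketch that scans a probe for the window closest to the consensus and reports which motif positions are mismatched. This is not the authors' analysis pipeline, which relied on SNAP array measurements and de novo motif discovery; the function names and the example probe sequences are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: function names and probe sequences are
# hypothetical, not taken from the study.
CONSENSUS = "AACCGGTT"  # canonical GRHL2 motif named in the abstract

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def mismatch_positions(site, consensus=CONSENSUS):
    """Return the 1-based motif positions where `site` differs from the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(site, consensus)) if a != b]


def best_site(probe, consensus=CONSENSUS):
    """Return (start, window, mismatches) for the window closest to the consensus.

    Ties are resolved in favour of the leftmost window. Returns None when the
    probe is shorter than the consensus.
    """
    k = len(consensus)
    best = None
    for start in range(len(probe) - k + 1):
        window = probe[start:start + k]
        mismatches = mismatch_positions(window, consensus)
        if best is None or len(mismatches) < len(best[2]):
            best = (start, window, mismatches)
    return best


if __name__ == "__main__":
    # The consensus is its own reverse complement, so one strand is enough.
    print(reverse_complement(CONSENSUS) == CONSENSUS)
    # Hypothetical probe carrying a perfect site.
    print(best_site("GGTAACCGGTTCA"))
    # Hypothetical site with a substitution at the C of position 3.
    print(mismatch_positions("AATCGGTT"))
```

Because AACCGGTT is palindromic, scanning a single strand covers both orientations. Relating mismatch positions to binding strength would additionally require the measured probe intensities, which are not reproduced here.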

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

Rank  Probability  Percentile  Papers in training set  Journal
1     18.9%        Top 0.5%    1128                    Nucleic Acids Research
2     10.2%        Top 0.1%    216                     Computational and Structural Biotechnology Journal
3     6.5%         Top 0.1%    14                      Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms
4     6.4%         Top 17%     3102                    Scientific Reports
5     4.4%         Top 19%     5422                    eLife
6     3.6%         Top 3%      453                     International Journal of Molecular Sciences
7     3.6%         Top 38%     4510                    PLOS ONE
50% of probability mass above
8     3.1%         Top 0.7%    217                     Journal of Molecular Biology
9     2.6%         Top 12%     1633                    PLOS Computational Biology
10    2.6%         Top 0.1%    263                     Life Science Alliance
11    2.4%         Top 0.5%    130                     Biochemistry
12    1.9%         Top 0.3%    35                      Cell Communication and Signaling
13    1.7%         Top 2%      641                     Journal of Biological Chemistry
14    1.7%         Top 0.6%    95                      Open Biology
15    1.7%         Top 0.2%    78                      The FEBS Journal
16    1.3%         Top 55%     4913                    Nature Communications
17    1.3%         Top 3%      146                     Heliyon
18    1.3%         Top 0.1%    12                      BMC Genomic Data
19    1.3%         Top 0.6%    35                      JACS Au
20    1.0%         Top 2%      81                      Plant Direct
21    1.0%         Top 29%     1338                    Cell Reports
22    0.9%         Top 1%      221                     Protein Science
23    0.8%         Top 0.8%    64                      Redox Biology
24    0.8%         Top 3%      90                      ACS Omega
25    0.8%         Top 3%      175                     The FASEB Journal
26    0.8%         Top 9%      218                     Frontiers in Cell and Developmental Biology
27    0.7%         Top 1%      43                      Epigenetics
28    0.7%         Top 7%      326                     Briefings in Bioinformatics
29    0.7%         Top 4%      172                     GigaScience
30    0.5%         Top 23%     249                     Advanced Science