Epigenomic methylome landscape of promoters in vertebrate genomes

Lee, Y. H.; Lee, C.; Jarvis, E.; Kim, H.

2026-03-30 bioinformatics
10.64898/2026.03.29.715150 bioRxiv
Genomic promoters are crucial gene regulatory elements [1,2]. Yet comparative analyses of promoter architecture have been constrained by the limited resolution of GC-rich regions in short-read-based genome resources [3-6]. The Vertebrate Genomes Project (VGP) provides more complete long-read-based assemblies [7], and 5-methylcytosine signals can further be detected directly from PacBio HiFi circular consensus reads [8,9]. Here, we developed a scalable computational framework to characterize DNA methylomes from HiFi data on high-quality Phase I VGP assemblies with RefSeq gene annotations for 82 vertebrate species spanning seven major taxonomic classes: mammals, birds, reptiles, amphibians, lobe-finned fishes, ray-finned fishes, and cartilaginous fishes. We observed a conserved, transcription start site-centered hypomethylation signature in promoters across all vertebrates, and an unexpected hypermethylation signature near gene boundaries that is discordant with transcripts. In addition to this conserved pattern, there were lineage-specific differences in promoter methylation profiles, with birds showing the most diverse patterns. These epigenetic landscapes track phylogenetic relationships more closely than tissue-type methylation differences and allow inference of lineage-dependent widths of core promoters and broader promoters across major vertebrate classes. Our findings establish a comparative epigenomic framework for profiling promoter methylomes from long-read sequencing data.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Nature Biotechnology | 147 | Top 0.1% | 22.7% |
| 2 | Nature Communications | 4913 | Top 10% | 14.5% |
| 3 | Nature Methods | 336 | Top 0.9% | 10.2% |
| 4 | Nature | 575 | Top 4% | 6.9% |
| | 50% of probability mass above | | | |
| 5 | Genome Biology | 555 | Top 1% | 4.9% |
| 6 | Science Advances | 1098 | Top 8% | 3.3% |
| 7 | Nucleic Acids Research | 1128 | Top 7% | 2.9% |
| 8 | Bioinformatics | 1061 | Top 6% | 2.6% |
| 9 | Cell Genomics | 162 | Top 2% | 2.1% |
| 10 | Advanced Science | 249 | Top 8% | 2.1% |
| 11 | Communications Biology | 886 | Top 7% | 1.8% |
| 12 | Genome Medicine | 154 | Top 4% | 1.7% |
| 13 | Nature Genetics | 240 | Top 4% | 1.7% |
| 14 | Nano Letters | 63 | Top 2% | 1.3% |
| 15 | Genome Research | 409 | Top 3% | 1.3% |
| 16 | Briefings in Bioinformatics | 326 | Top 5% | 1.0% |
| 17 | Nature Machine Intelligence | 61 | Top 3% | 0.9% |
| 18 | Cell Systems | 167 | Top 10% | 0.9% |
| 19 | Molecular Cell | 308 | Top 9% | 0.8% |
| 20 | Nature Plants | 84 | Top 2% | 0.8% |
| 21 | Scientific Reports | 3102 | Top 74% | 0.8% |
| 22 | Nature Chemical Biology | 104 | Top 4% | 0.7% |
| 23 | Cell Reports Methods | 141 | Top 5% | 0.7% |
| 24 | Genomics, Proteomics & Bioinformatics | 171 | Top 7% | 0.6% |
| 25 | Cancer Research | 116 | Top 4% | 0.5% |
| 26 | Nature Computational Science | 50 | Top 2% | 0.5% |
| 27 | iScience | 1063 | Top 40% | 0.5% |
| 28 | Nature Structural & Molecular Biology | 218 | Top 6% | 0.5% |
| 29 | Cell Research | 49 | Top 3% | 0.5% |
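
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked with a running sum over the ranked probabilities. The following is a minimal sketch of that arithmetic, not the site's actual method: the function name `journals_to_reach` is hypothetical, and only the first eight probabilities from the list above are used.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-8, copied from the list above.
probabilities = [22.7, 14.5, 10.2, 6.9, 4.9, 3.3, 2.9, 2.6]


def journals_to_reach(probs, threshold):
    """Return how many top-ranked entries are needed for the
    running total of `probs` to reach `threshold` (hypothetical helper)."""
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count
    return None  # the threshold is never reached


print(journals_to_reach(probabilities, 50.0))
```

The first three entries sum to 47.4%, which is below the threshold, and adding the fourth brings the total to 54.3%, so the cutoff falls after rank 4.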