Histone Modification Metapeaks are Epigenetic Landmarks Predictive of Cell State

Tanner, R. M.; Perkins, T. J.

2026-04-02 genomics
10.64898/2026.03.31.715657 bioRxiv

Histone modifications are a key component of the epigenetic state of a cell, and they vary widely across different cell and tissue types, conditions, and disease states. Indeed, the majority of the genome is enriched with one histone mark or another across the thousands of cellular conditions that have been studied to date. Here, we use the largest-to-date collection of histone modification ChIP-seq datasets to identify the most important sites of histone modifications genome-wide. Collected and uniformly reprocessed by the International Human Epigenome Consortium, this data includes 5339 datasets enriched at nearly one billion total peaks across 59 different major cell or tissue types and in healthy and disease conditions, for six different histone marks. We propose FindMetapeaks, a new approach to identifying histone mark metapeaks, which are genomic regions with enrichment of a mark across many samples. We show that many of these epigenetic metapeaks are strongly indicative of cell and tissue type, or are associated with other sample characteristics, and highlight key regulatory regions of the genome. However, we also show that many metapeaks contain redundant information, and that parsimonious subsets of metapeaks can be selected by machine learning to predict cell state. Our histone mark metapeak atlas provides a concise set of regions for interpreting the epigenome.

Availability: https://github.com/rmbioinfo83/FindMetapeaks/

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|------|---------|------------------------|------------|-----------------------|
| 1 | Genome Biology | 555 | Top 0.1% | 14.8% |
| 2 | Nature Methods | 336 | Top 0.8% | 10.5% |
| 3 | Nature Genetics | 240 | Top 0.8% | 8.5% |
| 4 | Nature | 575 | Top 4% | 7.2% |
| 5 | Nature Biotechnology | 147 | Top 1% | 6.4% |
| 6 | Genome Research | 409 | Top 0.4% | 6.4% |
|  | (50% of probability mass above) |  |  |  |
| 7 | Bioinformatics | 1061 | Top 4% | 4.9% |
| 8 | Nature Communications | 4913 | Top 35% | 4.3% |
| 9 | Cell Genomics | 162 | Top 1% | 3.6% |
| 10 | Cell Systems | 167 | Top 5% | 2.5% |
| 11 | Science | 429 | Top 11% | 2.4% |
| 12 | The American Journal of Human Genetics | 206 | Top 2% | 2.1% |
| 13 | Nucleic Acids Research | 1128 | Top 8% | 2.1% |
| 14 | Genome Medicine | 154 | Top 4% | 1.9% |
| 15 | PLOS Computational Biology | 1633 | Top 14% | 1.9% |
| 16 | Proceedings of the National Academy of Sciences | 2130 | Top 29% | 1.9% |
| 17 | Bioinformatics Advances | 184 | Top 4% | 1.2% |
| 18 | Cell Reports | 1338 | Top 29% | 1.1% |
| 19 | Cell | 370 | Top 15% | 0.9% |
| 20 | Cell Reports Methods | 141 | Top 4% | 0.9% |
| 21 | Nature Computational Science | 50 | Top 1% | 0.9% |
| 22 | Nature Machine Intelligence | 61 | Top 3% | 0.8% |
| 23 | Nature Neuroscience | 216 | Top 6% | 0.8% |
| 24 | Briefings in Bioinformatics | 326 | Top 7% | 0.7% |
| 25 | Scientific Reports | 3102 | Top 76% | 0.7% |
| 26 | PLOS ONE | 4510 | Top 71% | 0.6% |
| 27 | BMC Bioinformatics | 383 | Top 8% | 0.5% |
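
The 50% cutoff reported for the ranking can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch in Python, not the site's own code: the function name `journals_for_mass` is hypothetical, and the values are the predicted probabilities of the first eight journals as listed.

```python
from itertools import accumulate

# Predicted probabilities (percent) for the eight top-ranked journals,
# copied in rank order from the listing.
probs = [14.8, 10.5, 8.5, 7.2, 6.4, 6.4, 4.9, 4.3]


def journals_for_mass(probabilities, target):
    """Return how many top-ranked entries are needed for the running
    sum of probabilities to reach `target` (hypothetical helper)."""
    for count, total in enumerate(accumulate(probabilities), start=1):
        if total >= target:
            return count
    return None  # target is not reached by the entries supplied


print(journals_for_mass(probs, 50.0))  # prints 6
```

The running sum is about 47.4% after five journals and 53.8% after six, so the sixth journal is the first at which the cumulative mass reaches 50%.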