Combinatorial epigenomic patterns define regulatory programs underlying disease heterogeneity

Shim, W. J.; Bao, S. C.; Chow, C. S. Y.; Mizikovsky, D.; Shen, S.; Riedlshah, Z.; Zhao, Q.; Boden, M.; Palpant, N.

2026-05-05 genomics
10.64898/2026.05.01.722123 bioRxiv
Disease is a heterogeneous process that involves multiple organs and cell types. Understanding how genomic variation contributes to disease requires approaches that move beyond the linear assumptions of additive models and resolve underlying disease pathways. While genome-wide association studies have catalogued hundreds of thousands of genomic variants linked to disease, our understanding of their cell-type-specific roles remains limited, restricting our ability to translate genetic findings into targeted interventions. Here, we analyse consortium-scale epigenomic data spanning 833 biological samples across 8 epigenetic features to develop a generalisable machine learning framework that models the modular architecture of genome regulation. We define 720 epigenomic signatures, Epigenetically Co-Modulated Patterns (EpiCops), that capture co-regulated genomic regions with tissue- and cell-specific regulatory activity. Using EpiCops, we effectively segregate functional genomic loci of mixed biological contexts, including cell-type-specific enhancers and variants associated with complex traits and diseases. Applied to type 2 diabetes, EpiCops identify variant clusters associated with distinct biological pathways and organs, including clusters with opposing cardiovascular risk profiles driven by divergent organ-specific regulatory mechanisms. By integrating EpiCops with partitioned polygenic risk scores, we further validate the robustness of these variant clusters in independent cohort studies. Collectively, our study demonstrates that EpiCops provide a scalable framework for resolving the cell-type-specific regulatory architecture of complex disease and advancing mechanistic understanding of disease processes.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Nature Genetics | 240 papers in training set | Top 0.3% | 18.2%
2. Cell Genomics | 162 papers in training set | Top 0.1% | 12.0%
3. Nature Communications | 4913 papers in training set | Top 24% | 8.2%
4. Science | 429 papers in training set | Top 4% | 8.2%
5. Nature | 575 papers in training set | Top 5% | 6.2%

50% of probability mass above

6. Cell | 370 papers in training set | Top 3% | 6.1%
7. Genome Biology | 555 papers in training set | Top 3% | 3.5%
8. Genome Medicine | 154 papers in training set | Top 2% | 3.5%
9. Cell Reports | 1338 papers in training set | Top 19% | 2.7%
10. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 25% | 2.7%
11. Science Translational Medicine | 111 papers in training set | Top 2% | 2.3%
12. The American Journal of Human Genetics | 206 papers in training set | Top 2% | 1.8%
13. Nature Biotechnology | 147 papers in training set | Top 4% | 1.7%
14. Nature Neuroscience | 216 papers in training set | Top 4% | 1.6%
15. Nature Cell Biology | 99 papers in training set | Top 3% | 1.6%
16. Cell Systems | 167 papers in training set | Top 8% | 1.4%
17. Molecular Cell | 308 papers in training set | Top 8% | 1.4%
18. Science Advances | 1098 papers in training set | Top 22% | 1.3%
19. Nature Medicine | 117 papers in training set | Top 4% | 1.1%
20. Nature Aging | 51 papers in training set | Top 2% | 0.7%
21. Nature Machine Intelligence | 61 papers in training set | Top 4% | 0.7%
22. eLife | 5422 papers in training set | Top 60% | 0.7%
23. Journal of Clinical Investigation | 164 papers in training set | Top 8% | 0.6%
24. Nature Cardiovascular Research | 28 papers in training set | Top 0.7% | 0.6%