
Systematic Regional Bias is Widespread in ChIP-seq

Hughes, O.; Foley, G.; Balderson, B.; Piper, M.; Boden, M.

2026-05-13 bioinformatics
10.64898/2026.05.10.724164 bioRxiv

Robust and reproducible results are essential for confident scientific analysis. We demonstrate that transcription factor (TF) Chromatin Immunoprecipitation coupled with sequencing (ChIP-seq) suffers from systematic bias that may threaten its reproducibility: 80% of 200+ condition-matched, dual-replicate experiments in ENCODE contain genomic regions of systematic bias. We observe this regional bias even between replicates produced within the same experiment, resulting in thousands of unreplicated peaks, which often contain valuable biological data. We provide evidence that regional bias may lead to qualitative differences in TF biology inferred by different experiments; we discovered eight TFs with binding activity in compact chromatin that was identified by one experiment, yet systematically absent from others. To mitigate the effects of bias, we derive simple but effective metrics to quantify the quality of data within biased regions and demonstrate that they can be used for the robust integration of data from multiple experiments.

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Cell Systems, 167 papers in training set, Top 0.2%, 22.9%
2. Genome Biology, 555 papers in training set, Top 0.8%, 6.9%
3. Proceedings of the National Academy of Sciences, 2130 papers in training set, Top 11%, 6.4%
4. Nature Methods, 336 papers in training set, Top 2%, 4.9%
5. Nature Communications, 4913 papers in training set, Top 32%, 4.9%
6. Nature Biotechnology, 147 papers in training set, Top 2%, 4.2%

50% of probability mass above

7. Scientific Reports, 3102 papers in training set, Top 34%, 3.7%
8. Nucleic Acids Research, 1128 papers in training set, Top 5%, 3.7%
9. PLOS ONE, 4510 papers in training set, Top 38%, 3.7%
10. PLOS Computational Biology, 1633 papers in training set, Top 11%, 3.3%
11. Nature Genetics, 240 papers in training set, Top 3%, 2.9%
12. Science, 429 papers in training set, Top 11%, 2.5%
13. Bioinformatics, 1061 papers in training set, Top 6%, 2.5%
14. The American Journal of Human Genetics, 206 papers in training set, Top 2%, 2.1%
15. Genetics, 225 papers in training set, Top 2%, 1.9%
16. Genome Research, 409 papers in training set, Top 2%, 1.8%
17. NAR Genomics and Bioinformatics, 214 papers in training set, Top 2%, 1.7%
18. Nature, 575 papers in training set, Top 11%, 1.7%
19. eLife, 5422 papers in training set, Top 53%, 0.9%
20. Science Advances, 1098 papers in training set, Top 26%, 0.9%
21. Cell Genomics, 162 papers in training set, Top 6%, 0.8%
22. Biophysical Journal, 545 papers in training set, Top 5%, 0.8%
23. Communications Biology, 886 papers in training set, Top 20%, 0.8%
24. iScience, 1063 papers in training set, Top 31%, 0.8%
25. Cell Reports Methods, 141 papers in training set, Top 5%, 0.8%
26. BMC Bioinformatics, 383 papers in training set, Top 7%, 0.8%
27. Genome Medicine, 154 papers in training set, Top 8%, 0.8%
28. Cell, 370 papers in training set, Top 17%, 0.7%
29. Nature Machine Intelligence, 61 papers in training set, Top 4%, 0.7%
30. Nature Computational Science, 50 papers in training set, Top 2%, 0.7%