
simPIC: flexible simulation of single-cell ATAC-seq paired-insertion counts from individuals to populations

Chugh, S.; Shim, H. S.; McCarthy, D. J.

2026-05-15 bioinformatics
10.1101/2025.09.21.676689 bioRxiv

Single-cell Assay for Transposase-Accessible Chromatin sequencing (scATAC-seq) is increasingly used at population scale to study how genetic variation shapes chromatin accessibility. Method development is limited by the lack of flexible simulation tools with known ground truth. Here, we present simPIC, a fast, memory-efficient framework for simulating realistic single-cell ATAC-seq count data across individuals and populations. simPIC models cell groups, batch effects, and genotype-dependent accessibility variation, enabling controlled evaluation of population-scale methods, including chromatin accessibility quantitative trait locus (QTL) mapping. Across multiple datasets and cell types, simPIC closely matches real data distributions while scaling to cohort sizes impractical for current tools.

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Genome Biology | 555 papers in training set | Top 0.2% | 12.2%
2. Nature Biotechnology | 147 papers in training set | Top 0.6% | 10.3%
3. Bioinformatics | 1061 papers in training set | Top 3% | 10.0%
4. Nature Communications | 4913 papers in training set | Top 24% | 8.1%
5. The American Journal of Human Genetics | 206 papers in training set | Top 0.8% | 6.3%
6. Nature Methods | 336 papers in training set | Top 2% | 4.8%

50% of probability mass above

7. Nature Genetics | 240 papers in training set | Top 2% | 4.8%
8. Nucleic Acids Research | 1128 papers in training set | Top 4% | 4.8%
9. Cell Systems | 167 papers in training set | Top 3% | 4.1%
10. Science | 429 papers in training set | Top 9% | 3.6%
11. Cell Genomics | 162 papers in training set | Top 2% | 2.3%
12. Genome Research | 409 papers in training set | Top 2% | 1.9%
13. Genome Medicine | 154 papers in training set | Top 4% | 1.8%
14. Nature | 575 papers in training set | Top 11% | 1.7%
15. Bioinformatics Advances | 184 papers in training set | Top 3% | 1.7%
16. PLOS Computational Biology | 1633 papers in training set | Top 17% | 1.6%
17. PLOS Genetics | 756 papers in training set | Top 11% | 1.2%
18. PLOS ONE | 4510 papers in training set | Top 60% | 1.2%
19. NAR Genomics and Bioinformatics | 214 papers in training set | Top 3% | 0.9%
20. Molecular Biology and Evolution | 488 papers in training set | Top 4% | 0.7%
21. Genetics | 225 papers in training set | Top 4% | 0.7%
22. BMC Bioinformatics | 383 papers in training set | Top 7% | 0.7%
23. iScience | 1063 papers in training set | Top 35% | 0.7%
24. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 47% | 0.7%
25. Nature Machine Intelligence | 61 papers in training set | Top 4% | 0.6%
26. Cell Reports Methods | 141 papers in training set | Top 6% | 0.6%
27. Frontiers in Genetics | 197 papers in training set | Top 11% | 0.6%
28. Molecular Plant | 36 papers in training set | Top 2% | 0.6%
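
The 50% cutoff stated for the list can be checked by accumulating the listed probabilities in rank order. The sketch below is only an illustration of that arithmetic, assuming the percentages shown are the rounded per-journal probabilities; the function name `entries_to_reach` is hypothetical, and how the page itself computes the cutoff is not stated.

```python
from itertools import accumulate

# Predicted probabilities (percent) for the ten top-ranked journals,
# copied from the list; values are rounded as displayed.
probabilities = [12.2, 10.3, 10.0, 8.1, 6.3, 4.8, 4.8, 4.8, 4.1, 3.6]


def entries_to_reach(mass, probs):
    """Return (count, cumulative sum) for the smallest number of
    top-ranked entries whose cumulative probability reaches `mass`.

    Hypothetical helper for illustration only.
    """
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= mass:
            return count, total
    raise ValueError("probabilities never reach the requested mass")


count, total = entries_to_reach(50.0, probabilities)
print(count, round(total, 1))
```

With these values the running sum is 46.9% after five journals and 51.7% after six, which agrees with the statement that the top 6 journals account for 50% of the predicted probability mass.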