Comparing bulk and single-cell methodologies and models to profile gene expression, chromatin accessibility and regulatory links in endothelial cells treated with TNFα

Zevounou, J.; Lo, K. S.; McGinnis, C. S.; Satpathy, A. T.; Lettre, G.

2026-03-16 genomics
10.64898/2026.03.13.711357 bioRxiv
Genome-wide association studies (GWAS) have identified thousands of non-coding variants associated with complex traits and diseases. However, it remains challenging to pinpoint the causal genes that are regulated by associated genetic variants. Connecting causal non-coding variants with genes can rely on methods that identify direct physical interactions (e.g. chromosome conformation capture) or on probabilistic models that predict regulatory links. These statistical models take advantage of gene expression and chromatin accessibility profiles generated in cells and tissues by bulk or single-cell (sc) methodologies. Here, we tested whether using bulk or sc RNAseq/ATACseq data and the corresponding predictive enhancer-to-gene models impacts the prioritization of causal GWAS genes. Using non-treated and TNFα-treated human endothelial cells in vitro as a well-controlled experimental system, we show that bulk and sc RNAseq/ATACseq profiles are similar and highlight the same biology (e.g. biological pathways). Despite these similarities, we show using GWAS results for coronary artery disease (CAD) and diastolic blood pressure that applying enhancer-to-gene models designed for bulk or sc methodologies can yield differences in captured heritability, fine-mapped variants and linked genes. For instance, at one CAD locus, the bulk-based ABC model predicts a regulatory link with BCAR1, whereas the sc-based model scE2G prioritizes a different gene (CFDP1). Our results indicate that, even within the same experimental model, choosing between a bulk and a sc approach influences the predictions of regulatory link models; this should be considered when planning functional experiments to characterize GWAS discoveries.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. PLOS Computational Biology: 1633 papers in training set, Top 0.7%, 22.6%
2. Scientific Reports: 3102 papers in training set, Top 6%, 10.1%
3. Frontiers in Genetics: 197 papers in training set, Top 0.3%, 10.1%
4. Bioinformatics: 1061 papers in training set, Top 3%, 7.2%

50% of probability mass above

5. Nucleic Acids Research: 1128 papers in training set, Top 4%, 4.3%
6. NAR Genomics and Bioinformatics: 214 papers in training set, Top 0.6%, 3.9%
7. Genetic Epidemiology: 46 papers in training set, Top 0.2%, 3.6%
8. PLOS Genetics: 756 papers in training set, Top 6%, 2.6%
9. Nature Communications: 4913 papers in training set, Top 45%, 2.6%
10. Communications Biology: 886 papers in training set, Top 4%, 2.5%
11. The American Journal of Human Genetics: 206 papers in training set, Top 2%, 2.4%
12. eLife: 5422 papers in training set, Top 34%, 2.4%
13. PLOS ONE: 4510 papers in training set, Top 47%, 2.1%
14. iScience: 1063 papers in training set, Top 14%, 1.7%
15. Computational and Structural Biotechnology Journal: 216 papers in training set, Top 5%, 1.7%
16. Briefings in Bioinformatics: 326 papers in training set, Top 5%, 1.3%
17. G3 Genes|Genomes|Genetics: 351 papers in training set, Top 2%, 1.2%
18. F1000Research: 79 papers in training set, Top 3%, 1.2%
19. GigaScience: 172 papers in training set, Top 2%, 0.9%
20. BMC Bioinformatics: 383 papers in training set, Top 6%, 0.9%
21. Human Genetics and Genomics Advances: 70 papers in training set, Top 0.8%, 0.7%
22. Genome Biology: 555 papers in training set, Top 8%, 0.7%
23. Frontiers in Immunology: 586 papers in training set, Top 8%, 0.7%
24. Journal of the American Heart Association: 119 papers in training set, Top 4%, 0.6%
25. European Journal of Human Genetics: 49 papers in training set, Top 2%, 0.5%
26. Genome Research: 409 papers in training set, Top 5%, 0.5%