
Using a modular massively parallel reporter assay to discover context-specific regulatory grammars in type 2 diabetes

Tovar, A.; Kyono, Y.; Nishino, K.; Bose, M.; Varshney, A.; Parker, S. C. J.; Kitzman, J. O.

2023-10-10 genomics
10.1101/2023.10.08.561391 bioRxiv

Most genome-wide association signals for complex disease reside in the noncoding genome, where defining function is nontrivial. MPRAs (massively parallel reporter assays) offer a scalable means to identify functional regulatory elements, but are typically conducted without regard to cell type, pairing cloned fragments with a generic housekeeping promoter. To explore the context-sensitivity of MPRAs, we screened enhancer activity across a panel of nearly 12,000 198-bp fragments spanning over 300 type 2 diabetes- and metabolic trait-associated regions in the 832/13 rat insulinoma beta cell line, a relevant model of pancreatic beta cells. We explored these fragments' context sensitivity by comparing their activities when placed up- or downstream of a reporter gene, and in combination with either a synthetic housekeeping promoter (SCP1) or a more biologically relevant promoter corresponding to the human insulin (INS) gene. We identified clear effects of MPRA construct design on enhancer activity. Specifically, a subset of fragments (n = 702/11,656) displayed positional bias, evenly distributed across up- and downstream preference. Promoter choice also influenced MPRA activity (n = 698/11,656), mostly biased towards the cell-specific INS promoter (73.4%). To identify sequence features associated with promoter preference, we used Lasso regression with 562 genomic annotations and discovered that fragments with INS promoter-biased activity are enriched for HNF1 motifs. HNF1 family transcription factors are key regulators of glucose metabolism disrupted in maturity-onset diabetes of the young (MODY), suggesting genetic convergence between rare coding variants that cause MODY and common T2D-associated regulatory regions.
We designed a follow-up MPRA containing HNF1 motif-enriched fragments and observed several instances where deletion or mutation of HNF1 motifs disrupted the INS promoter-biased enhancer activity, specifically in the beta cell model but not in a skeletal muscle cell line, another diabetes-relevant cell type. Together, our study suggests that cell-specific regulatory activity is partially influenced by enhancer-promoter compatibility and indicates that careful attention should be paid when designing MPRA libraries to capture context-specific regulatory processes at disease-associated genetic signals.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Probability |
|---|---|---|---|---|
| 1 | Cell Genomics | 162 | Top 0.1% | 25.7% |
| 2 | Nature Genetics | 240 | Top 0.8% | 8.3% |
| 3 | Cell Systems | 167 | Top 2% | 6.3% |
| 4 | Nature Communications | 4913 | Top 30% | 6.3% |
| 5 | Cell Reports | 1338 | Top 10% | 4.8% |
| | 50% of probability mass above | | | |
| 6 | The American Journal of Human Genetics | 206 | Top 1% | 4.3% |
| 7 | Frontiers in Genetics | 197 | Top 2% | 3.6% |
| 8 | Genome Medicine | 154 | Top 2% | 3.6% |
| 9 | eLife | 5422 | Top 26% | 3.6% |
| 10 | Diabetes | 53 | Top 0.3% | 2.6% |
| 11 | PLOS Genetics | 756 | Top 7% | 2.1% |
| 12 | Nature | 575 | Top 10% | 1.8% |
| 13 | Cell | 370 | Top 11% | 1.7% |
| 14 | Scientific Reports | 3102 | Top 62% | 1.5% |
| 15 | Genome Research | 409 | Top 3% | 1.3% |
| 16 | BMC Genomics | 328 | Top 3% | 1.3% |
| 17 | Science | 429 | Top 17% | 1.2% |
| 18 | Communications Biology | 886 | Top 15% | 1.2% |
| 19 | Molecular Systems Biology | 142 | Top 1.0% | 1.2% |
| 20 | Nucleic Acids Research | 1128 | Top 14% | 1.1% |
| 21 | Genome Biology | 555 | Top 6% | 0.9% |
| 22 | Cell Reports Methods | 141 | Top 5% | 0.8% |
| 23 | NAR Genomics and Bioinformatics | 214 | Top 4% | 0.7% |
| 24 | Bioinformatics Advances | 184 | Top 5% | 0.7% |
| 25 | Molecular Cell | 308 | Top 10% | 0.7% |
| 26 | Science Advances | 1098 | Top 32% | 0.7% |
| 27 | PLOS Computational Biology | 1633 | Top 26% | 0.7% |
| 28 | Diabetologia | 36 | Top 1% | 0.7% |
| 29 | Bioinformatics | 1061 | Top 10% | 0.6% |
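
The "50% of probability mass" cutoff can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch in Python, using the ten highest probabilities copied from the list above; the function name `journals_needed` and the variable `probabilities` are hypothetical and not part of the page or its prediction model.

```python
from itertools import accumulate

# Predicted probabilities (%) for ranks 1-10, copied from the list above.
probabilities = [25.7, 8.3, 6.3, 6.3, 4.8, 4.3, 3.6, 3.6, 3.6, 2.6]


def journals_needed(probs, threshold=50.0):
    """Return how many top-ranked entries are needed to reach `threshold` percent.

    Hypothetical helper for illustration: walks the running total in rank
    order and stops at the first entry where the total meets the threshold.
    """
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count
    return None  # threshold not reached with the entries provided


print(journals_needed(probabilities))
```

The running total after four journals is about 46.6%, and the fifth (Cell Reports, 4.8%) brings it to about 51.4%, which agrees with the statement that the top 5 journals account for roughly 50% of the predicted probability mass.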