Rare germline genetic variation in PAX8 transcription factor binding sites and susceptibility to epithelial ovarian cancer

Ezquina, S. A. M.; Jones, M.; Dicks, E.; de Vries, A.; Peng, P.-C.; Corona, R. I.; Lawrenson, K.; Tyrer, J. P.; Hazelett, D.; Brenton, J. D.; Antoniou, A. C.; Gayther, S. A.; Pharoah, P. D. P.

2023-03-22 genetic and genomic medicine
10.1101/2023.03.22.23287587 medRxiv
Common genetic variation throughout the genome, together with rare coding variants identified to date, explain about half of the inherited genetic component of epithelial ovarian cancer risk. It is likely that rare variation in the non-coding genome will explain some of the unexplained heritability, but identifying such variants is challenging. The primary problem is a lack of statistical power to identify individual risk variants by association, as power is a function of sample size, effect size and allele frequency. Power can be increased by using burden tests, which test for association of carriers of any variant in a specified genomic region. This has the effect of increasing the putative effect allele frequency. PAX8 is a transcription factor that plays a critical role in tumour progression, migration and invasion. Furthermore, regulatory elements proximal to target genes of PAX8 are enriched for common ovarian cancer risk variants. We hypothesised that rare variation in PAX8 binding sites is also associated with ovarian cancer risk, but unlikely to be associated with risk of breast, colorectal or endometrial cancer. We have used publicly available whole-genome sequencing data from the UK 100,000 Genomes Project to evaluate the burden of rare variation in PAX8 binding sites across the genome. Data were available for 522 ovarian cancers, 2560 breast cancers, 2465 colorectal cancers, 729 endometrial cancers and 2253 non-cancer controls. Active binding sites were defined using data from multiple PAX8 and H3K27 ChIPseq experiments. We found no association between the burden of rare variation in PAX8 binding sites (defined in several ways) and risk of ovarian, breast or endometrial cancer. An apparent association with colorectal cancer was likely to be a technical artefact, as a similar association was also detected for rare variation in random regions of the genome.
Despite the null result, this study provides a proof-of-principle for using burden testing to identify rare, non-coding germline genetic variation associated with disease. Larger sample sizes available from large-scale sequencing projects, together with improved understanding of the function of the non-coding genome, will increase the potential of similar studies in the future.
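
The burden-testing idea described in the abstract can be illustrated with a minimal sketch. It collapses all rare variants in a region into a single carrier flag per person and then compares carrier counts between cases and controls. The abstract does not state which test statistic the authors used, so the two-sided Fisher exact test here is an assumption chosen for simplicity, and the function names (`carrier_status`, `fisher_exact_two_sided`, `burden_test`) and the toy genotypes are hypothetical.

```python
from math import comb


def carrier_status(genotypes):
    """Collapse per-variant allele counts into one carrier flag per person.

    genotypes: list of people, each a list of rare-allele counts (0, 1 or 2)
    for every variant in the region of interest.
    """
    return [any(count > 0 for count in person) for person in genotypes]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among the cases.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def burden_test(case_genotypes, control_genotypes):
    """Return the carrier 2x2 table and its Fisher exact p-value."""
    a = sum(carrier_status(case_genotypes))
    b = len(case_genotypes) - a
    c = sum(carrier_status(control_genotypes))
    d = len(control_genotypes) - c
    return (a, b, c, d), fisher_exact_two_sided(a, b, c, d)


# Toy data (invented for illustration): 4 cases and 4 controls, 3 variants.
cases = [[0, 1, 0], [0, 0, 0], [1, 0, 0], [0, 0, 2]]
controls = [[0, 0, 0], [0, 0, 0], [0, 1, 0], [0, 0, 0]]
table, p_value = burden_test(cases, controls)
print(table, round(p_value, 4))
```

Each individual variant in the toy data is carried by at most two people, but the collapsed carrier frequency is higher, which is the sense in which a burden test raises the putative effect allele frequency. A real analysis would typically also adjust for covariates such as ancestry and sequencing batch, which this sketch omits.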

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Journal of Medical Genetics | 28 | Top 0.1% | 28.6% |
| 2 | PLOS Genetics | 756 | Top 2% | 6.5% |
| 3 | Cancer Epidemiology, Biomarkers & Prevention | 17 | Top 0.1% | 4.5% |
| 4 | Scientific Reports | 3102 | Top 26% | 4.5% |
| 5 | The American Journal of Human Genetics | 206 | Top 1.0% | 4.5% |
| 6 | Frontiers in Genetics | 197 | Top 1% | 4.5% |
| | 50% of probability mass above | | | |
| 7 | European Journal of Human Genetics | 49 | Top 0.2% | 4.5% |
| 8 | Nature Communications | 4913 | Top 36% | 4.1% |
| 9 | Cancers | 200 | Top 2% | 2.4% |
| 10 | eLife | 5422 | Top 39% | 1.8% |
| 11 | F1000Research | 79 | Top 1% | 1.8% |
| 12 | Genomics | 60 | Top 0.9% | 1.8% |
| 13 | GENETICS | 189 | Top 0.6% | 1.8% |
| 14 | PLOS ONE | 4510 | Top 52% | 1.8% |
| 15 | Genetics in Medicine | 69 | Top 0.7% | 1.4% |
| 16 | Frontiers in Molecular Biosciences | 100 | Top 3% | 1.3% |
| 17 | npj Genomic Medicine | 33 | Top 0.6% | 1.1% |
| 18 | Human Molecular Genetics | 130 | Top 3% | 0.9% |
| 19 | Genome Medicine | 154 | Top 7% | 0.8% |
| 20 | Genetic Epidemiology | 46 | Top 0.8% | 0.8% |
| 21 | iScience | 1063 | Top 30% | 0.8% |
| 22 | Human Reproduction | 18 | Top 0.4% | 0.8% |
| 23 | Human Mutation | 29 | Top 0.7% | 0.8% |
| 24 | International Journal of Epidemiology | 74 | Top 2% | 0.8% |
| 25 | Frontiers in Bioinformatics | 45 | Top 0.8% | 0.8% |
| 26 | Bioinformatics | 1061 | Top 9% | 0.8% |
| 27 | European Journal of Cancer | 10 | Top 0.5% | 0.7% |
| 28 | PLOS Computational Biology | 1633 | Top 27% | 0.7% |
| 29 | Cell Genomics | 162 | Top 7% | 0.7% |
| 30 | Genes | 126 | Top 4% | 0.7% |