Using DEPendency of association on the number of Top Hits (DEPTH) as a complementary tool to identify novel risk loci in colorectal cancer

Lai, J.; Wong, C.; Schmidt, D. F.; Kapuscinski, M.; Alpen, K.; MacInnis, R. J.; Buchanan, D. D.; Win, A. K.; Figueiredo, J.; Chan, A. T.; Harrison, T. A.; Hoffmeister, M.; White, E.; Marchand, L. L.; Peters, U.; Hopper, J. L.; Makalic, E.; Jenkins, M. A.

2022-11-27 epidemiology
10.1101/2022.11.24.22282734 medRxiv
Background: DEPendency of association on the number of Top Hits (DEPTH) is an approach to identify candidate risk regions by considering the risk signals from overlapping groups of sequential variants across the genome.

Methods: We applied a DEPTH analysis with a sliding window of 200 SNPs to colorectal cancer (CRC) data from the Colon Cancer Family Registry (CCFR) (5,735 cases and 3,688 controls) and GECCO (8,865 cases and 10,285 controls) studies. A DEPTH score >1 was used to identify risk regions common to both studies. We compared DEPTH results against those from conventional GWAS analyses of these two studies as well as against 132 published risk regions.

Results: Initial DEPTH analysis revealed 2,622 (CCFR) and 3,686 (GECCO) risk regions, of which 569 were common to both studies. Bootstrapping revealed 40 and 49 likely risk regions in the CCFR and GECCO data sets, respectively. Notably, DEPTH identified at least 82 likely risk regions that would not be detected using conventional GWAS methods and that had not been identified in previous CRC GWASs. We found four reproducible risk regions (2q22.2, 2q33.1, 6p21.32, 13q14.3), with the HLA locus at 6p21 having the highest DEPTH score. The most strongly associated SNPs were rs762216297, rs149490268, rs114741460, and rs199707618 for the CCFR data, and rs9270761 for the GECCO data.

Conclusion: DEPTH can identify novel likely risk regions for CRC not identified using conventional analyses of much larger datasets.

Impact: DEPTH has potential as a powerful complementary tool to conventional GWAS analyses for identifying risk regions within the genome.
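The "overlapping groups of sequential variants" in the abstract can be pictured as a sliding window over an ordered list of SNPs. The sketch below only enumerates such windows; it is not the DEPTH scoring procedure, which the abstract does not describe. The function name `sliding_windows` and the step of one variant are assumptions made for illustration; only the window size of 200 comes from the abstract.

```python
def sliding_windows(variants, window_size=200, step=1):
    """Yield overlapping windows of consecutive variants.

    window_size=200 follows the abstract; step=1 is an assumption.
    If there are fewer variants than window_size, nothing is yielded.
    """
    for start in range(0, len(variants) - window_size + 1, step):
        yield variants[start:start + window_size]


# Toy example with a small window so the overlap is easy to see.
toy_variants = ["snp1", "snp2", "snp3", "snp4", "snp5"]
for window in sliding_windows(toy_variants, window_size=3):
    print(window)
```

With a step of one, consecutive windows share all but one variant, which is what makes the groups overlap.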

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Cancer Epidemiology, Biomarkers & Prevention; 17 papers in training set; Top 0.1%; 23.1%
2. International Journal of Epidemiology; 74 papers in training set; Top 0.1%; 19.1%
3. PLOS ONE; 4510 papers in training set; Top 24%; 7.0%
4. Scientific Reports; 3102 papers in training set; Top 16%; 6.5%
50% of probability mass above
5. BMC Bioinformatics; 383 papers in training set; Top 2%; 5.0%
6. Bioinformatics; 1061 papers in training set; Top 5%; 4.1%
7. Genetic Epidemiology; 46 papers in training set; Top 0.3%; 2.1%
8. BMC Research Notes; 29 papers in training set; Top 0.1%; 1.9%
9. International Journal of Cancer; 42 papers in training set; Top 0.5%; 1.9%
10. PeerJ; 261 papers in training set; Top 7%; 1.7%
11. F1000Research; 79 papers in training set; Top 1%; 1.7%
12. BMC Medicine; 163 papers in training set; Top 4%; 1.5%
13. JNCI Cancer Spectrum; 10 papers in training set; Top 0.3%; 1.4%
14. Journal of Medical Genetics; 28 papers in training set; Top 0.4%; 1.0%
15. Epidemiology and Infection; 84 papers in training set; Top 2%; 0.9%
16. BMC Medical Genomics; 36 papers in training set; Top 1%; 0.8%
17. BMC Cancer; 52 papers in training set; Top 2%; 0.8%
18. BioData Mining; 15 papers in training set; Top 0.8%; 0.8%
19. Journal of Biomedical Informatics; 45 papers in training set; Top 1%; 0.8%
20. npj Genomic Medicine; 33 papers in training set; Top 0.8%; 0.8%
21. Cancer Medicine; 24 papers in training set; Top 1%; 0.7%
22. Journal of Clinical Medicine; 91 papers in training set; Top 7%; 0.7%
23. British Journal of Cancer; 42 papers in training set; Top 2%; 0.7%
24. Human Mutation; 29 papers in training set; Top 0.8%; 0.7%
25. Frontiers in Oncology; 95 papers in training set; Top 4%; 0.7%
26. JAMA Network Open; 127 papers in training set; Top 5%; 0.7%
27. PLOS Computational Biology; 1633 papers in training set; Top 27%; 0.7%
28. Wellcome Open Research; 57 papers in training set; Top 3%; 0.5%
29. Nucleic Acids Research; 1128 papers in training set; Top 21%; 0.5%
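
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. A minimal Python sketch follows; the function name `entries_to_reach` is hypothetical, and the input holds only the first five percentages copied from the list.

```python
def entries_to_reach(probabilities, threshold=50.0):
    """Return how many leading entries are needed for the running
    total to reach the threshold, or None if it is never reached."""
    total = 0.0
    for count, probability in enumerate(probabilities, start=1):
        total += probability
        if total >= threshold:
            return count
    return None


# Percentages for ranks 1 to 5, copied from the list.
top_probabilities = [23.1, 19.1, 7.0, 6.5, 5.0]
print(entries_to_reach(top_probabilities))
```

The first three entries sum to 49.2%, just under the threshold, and adding the fourth brings the total to 55.7%, which is why the cutoff falls after rank 4.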