
Comparative fine-mapping of breast cancer susceptibility loci using summary statistics methods and multinomial regression

O'Mahony, D. G.; Beasley, J.; Zanti, M.; Dennis, J.; Dutta, D.; Kraft, P.; Kristensen, V.; Chenevix-Trench, G.; Easton, D. F.; Michailidou, K.

2026-04-22 epidemiology
10.64898/2026.04.21.26351364 medRxiv

Summary statistics fine-mapping methods offer advantages over classical methods, including avoiding data-sharing constraints and improved modelling of correlated variables and sparse effects. However, their performance has not been comprehensively evaluated in breast cancer using real-world data. Previous multinomial stepwise regression (MNR) fine-mapping analyses for breast cancer identified 196 credible sets. Here, we apply summary statistics fine-mapping, compare methods, and assess parameters influencing performance. Using summary statistics from the Breast Cancer Association Consortium, we compared finiMOM, SuSiE, and FINEMAP to published MNR results across 129 regions. Performance was assessed by recall using in-sample and out-of-sample LD. Discordant credible sets were examined for technical factors, and target genes were defined using the INQUISIT pipeline. SuSiE showed the closest agreement with MNR. Results varied across regions depending on the assumed number of causal variants (L), with higher values reducing recall and no single L maximising performance. At the optimal L per region, SuSiE identified 8,192 CCVs in 244 credible sets, with recall of 88%, 86%, and 72% for overall, ER-positive, and ER-negative breast cancer, respectively. Thirty MNR sets were missed. Discordance was partially explained by allele flips, imputation quality, and array heterogeneity. Fifty-two MNR-identified genes, including BRCA2, WNT7B, and CREBBP, were not recovered, while additional candidate genes were identified. Using out-of-sample LD reduced recall by 3% but identified novel variants. Fine-mapping results vary across methods, and no single approach is sufficient. The choice of L strongly influences results, and combining analytical approaches with functional validation can improve causal variant identification.
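The abstract reports recall against the published MNR results but does not define how it was computed. As a minimal sketch, assuming recall means the fraction of reference (MNR) credible sets that share at least one variant with any credible set from the method under comparison, the calculation could look like the following. The function name, set labels, and variant identifiers are hypothetical and serve only as an illustration; they are not taken from the study.

```python
from typing import Dict, Set


def credible_set_recall(
    reference: Dict[str, Set[str]],
    candidate: Dict[str, Set[str]],
) -> float:
    """Fraction of reference credible sets overlapping any candidate set.

    Assumption: a reference set counts as recovered when it shares at
    least one variant with the union of all candidate sets. The study
    may use a different definition.
    """
    if not reference:
        raise ValueError("reference must contain at least one credible set")
    # Pool every variant reported by the candidate method.
    candidate_variants: Set[str] = set().union(*candidate.values())
    recovered = sum(
        1 for variants in reference.values() if variants & candidate_variants
    )
    return recovered / len(reference)


# Toy data (hypothetical labels and variant IDs).
reference_sets = {
    "mnr_1": {"rs1", "rs2"},
    "mnr_2": {"rs3"},
    "mnr_3": {"rs4", "rs5"},
}
candidate_sets = {
    "susie_1": {"rs2", "rs9"},
    "susie_2": {"rs5"},
}

recall = credible_set_recall(reference_sets, candidate_sets)
print(f"recall = {recall:.2f}")
```

A variant-level definition (the fraction of reference variants present in any candidate set) is an equally plausible reading and would change only the counting step.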

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. Nature Communications: 4913 papers in training set; Top 14%; 12.4%
2. Cancer Epidemiology, Biomarkers & Prevention: 17 papers in training set; Top 0.1%; 10.0%
3. PLOS Computational Biology: 1633 papers in training set; Top 5%; 7.1%
4. The American Journal of Human Genetics: 206 papers in training set; Top 0.6%; 7.1%
5. Breast Cancer Research: 32 papers in training set; Top 0.1%; 6.3%
6. International Journal of Epidemiology: 74 papers in training set; Top 0.3%; 6.3%
7. Scientific Reports: 3102 papers in training set; Top 19%; 6.3%

50% of probability mass above

8. PLOS Genetics: 756 papers in training set; Top 3%; 4.8%
9. Genetic Epidemiology: 46 papers in training set; Top 0.2%; 3.6%
10. npj Genomic Medicine: 33 papers in training set; Top 0.1%; 3.6%
11. eLife: 5422 papers in training set; Top 31%; 2.7%
12. Genome Medicine: 154 papers in training set; Top 3%; 2.4%
13. Nature Genetics: 240 papers in training set; Top 3%; 2.1%
14. PLOS ONE: 4510 papers in training set; Top 51%; 1.9%
15. BMC Genomics: 328 papers in training set; Top 2%; 1.9%
16. American Journal of Epidemiology: 57 papers in training set; Top 0.7%; 1.7%
17. Bioinformatics: 1061 papers in training set; Top 7%; 1.7%
18. Cell Genomics: 162 papers in training set; Top 4%; 1.5%
19. eBioMedicine: 130 papers in training set; Top 3%; 0.9%
20. Human Molecular Genetics: 130 papers in training set; Top 3%; 0.8%
21. BMC Bioinformatics: 383 papers in training set; Top 7%; 0.7%
22. JNCI Cancer Spectrum: 10 papers in training set; Top 0.5%; 0.7%
23. Human Genetics and Genomics Advances: 70 papers in training set; Top 0.8%; 0.7%
24. International Journal of Cancer: 42 papers in training set; Top 2%; 0.6%
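
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked from the percentages listed above. The sketch below assumes the cutoff is the smallest rank at which the running total reaches 50%; the function name is hypothetical, and only the first ten listed probabilities are needed for the check.

```python
from itertools import accumulate
from typing import Optional, Sequence

# Predicted probabilities (percent) for ranks 1-10, copied from the list.
probabilities = [12.4, 10.0, 7.1, 7.1, 6.3, 6.3, 6.3, 4.8, 3.6, 3.6]


def journals_to_reach(probs: Sequence[float], threshold: float) -> Optional[int]:
    """Return the smallest rank whose cumulative probability meets threshold."""
    for rank, running_total in enumerate(accumulate(probs), start=1):
        if running_total >= threshold:
            return rank
    return None  # threshold not reached within the supplied ranks


top_k = journals_to_reach(probabilities, 50.0)
print(top_k)
```

With these values the running total is about 49.2% after rank 6 and about 55.5% after rank 7, which is consistent with the cutoff shown in the list.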