Navigating the Privacy-Accuracy Tradeoff: Federated Survival Analysis with Binning and Differential Privacy

Gouthamchand, V.; van Soest, J.; Arcuri, G.; Dekker, A.; Damiani, A.; Wee, L.

2024-10-09 health informatics
10.1101/2024.10.09.24315159 medRxiv
Federated learning (FL) offers a decentralized approach to model training, allowing for data-driven insights while safeguarding patient privacy across institutions. In the Personal Health Train (PHT) paradigm, each institution transmits local model gradients, aggregated over its own patient sample, to a central server to be globally merged, rather than transmitting the patient data itself. However, certain attacks on a PHT infrastructure may risk compromising sensitive data. This study examines the privacy-accuracy tradeoff in federated Cox Proportional Hazards (CoxPH) models for survival analysis by assessing two Privacy-Enhancing Techniques (PETs) added on top of the PHT approach. In the first, we implemented a Discretized Cox model, grouping event times into a finite number of bins to hide individual time-to-event data points. In the second, we explored Local Differential Privacy by adding noise to local model gradients. Our results demonstrate that both strategies can effectively mitigate privacy risks without significantly compromising numerical accuracy, as reflected in only small variations in hazard ratios and cumulative baseline hazard curves. Our findings highlight the potential for enhancing privacy-preserving survival analysis within a PHT implementation and suggest practical solutions for multi-institutional research that mitigate the risk of re-identification attacks.
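
The two techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the bin width, the binning convention, or the noise mechanism, so the fixed-width bins, the upper-edge convention, the L2 clipping step, the Gaussian noise, and the names `bin_event_times` and `privatize_gradient` are all assumptions made for illustration. The noise scale is not calibrated to any formal privacy budget here.

```python
import math
import random


def bin_event_times(times, bin_width):
    """Replace each event time with the upper edge of its fixed-width bin.

    Assumption: fixed-width bins and the upper-edge convention are
    illustrative choices, not taken from the source.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    # A time t with (k-1)*w < t <= k*w is reported as k*w.
    return [math.ceil(t / bin_width) * bin_width for t in times]


def privatize_gradient(gradient, clip_norm, noise_std, rng=None):
    """Clip a local gradient to an L2 norm bound, then add Gaussian noise.

    Assumption: clipping plus Gaussian noise is one common way to perturb
    gradients locally; the source only says that noise is added.
    """
    rng = rng or random.Random()
    norm = math.sqrt(sum(g * g for g in gradient))
    # Scale down only when the norm exceeds the bound.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale + rng.gauss(0.0, noise_std) for g in gradient]


if __name__ == "__main__":
    # Hypothetical follow-up times in months, grouped into 6-month bins.
    print(bin_event_times([1, 6, 7, 13], bin_width=6))
    # Hypothetical local gradient, perturbed before leaving the institution.
    print(privatize_gradient([3.0, 4.0], clip_norm=1.0, noise_std=0.1,
                             rng=random.Random(0)))
```

In this sketch, wider bins and a larger `noise_std` both hide more about individual patients at the cost of coarser estimates, which is the tradeoff the study evaluates.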

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1 | npj Digital Medicine | 97 papers in training set | Top 0.2% | 18.7%
2 | Patterns | 70 papers in training set | Top 0.1% | 14.4%
3 | Scientific Reports | 3102 papers in training set | Top 6% | 10.1%
4 | PLOS Digital Health | 91 papers in training set | Top 0.3% | 6.8%
50% of probability mass above
5 | Journal of the American Medical Informatics Association | 61 papers in training set | Top 0.6% | 4.3%
6 | Nature Communications | 4913 papers in training set | Top 35% | 4.3%
7 | IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 0.4% | 4.0%
8 | Journal of Biomedical Informatics | 45 papers in training set | Top 0.4% | 3.7%
9 | PLOS ONE | 4510 papers in training set | Top 42% | 3.1%
10 | Journal of Medical Internet Research | 85 papers in training set | Top 2% | 2.1%
11 | JMIR Medical Informatics | 17 papers in training set | Top 0.6% | 1.9%
12 | BMC Medical Informatics and Decision Making | 39 papers in training set | Top 1% | 1.8%
13 | PLOS Computational Biology | 1633 papers in training set | Top 17% | 1.7%
14 | Frontiers in Artificial Intelligence | 18 papers in training set | Top 0.3% | 1.5%
15 | JAMIA Open | 37 papers in training set | Top 1% | 1.1%
16 | BMC Medical Research Methodology | 43 papers in training set | Top 0.9% | 1.1%
17 | Bioinformatics | 1061 papers in training set | Top 8% | 1.0%
18 | Nature Computational Science | 50 papers in training set | Top 2% | 0.7%
19 | Communications Medicine | 85 papers in training set | Top 1% | 0.7%
20 | Artificial Intelligence in the Life Sciences | 11 papers in training set | Top 0.2% | 0.7%
21 | iScience | 1063 papers in training set | Top 32% | 0.7%
22 | Physical Biology | 43 papers in training set | Top 3% | 0.6%
23 | Communications Biology | 886 papers in training set | Top 29% | 0.6%
24 | International Journal of Medical Informatics | 25 papers in training set | Top 2% | 0.6%