FAMES: Federated additive model using piecewise exponential survival data

Islam, N.; Luo, C.; Tong, J.; Weller, G.; Polleya, D. A.; Kent, A.; Bair, S.

2026-05-19 health informatics
10.64898/2026.05.15.26353335 medRxiv

Introduction: In analyses of time-to-event data, clinical characteristics can have non-linear effects on survival outcomes, and understanding these effects is crucial for producing real-world evidence (RWE). Estimating them from real-world data (RWD) is inherently challenging, especially because sharing individual-level patient data (IPD) is heavily restricted by regulation. Computational difficulties are compounded by the high dimensionality, inter-dependency, rarity, sparsity, and scarcity of features. Pooling data across multiple sites could address these challenges, but such collaboration is often infeasible because privacy regulations prevent sites from sharing IPD.

Objectives: To address this challenge, we propose a privacy-preserving regularized algorithm that eliminates the need to aggregate any protected health information across sites. The algorithm, a penalized federated additive model using piecewise exponential survival data (FAMES), estimates non-linear effects of features while accounting for non-varying confounding effects. The model is flexible and can accommodate both multiple and multivariate smooth effects simultaneously.

Methods: The proposed model transforms survival data into a piecewise exponential data (PED) structure and casts the semi-parametric optimization problem into a generalized additive modeling framework assuming a Poisson distribution. The model uses orthonormal splines to approximate non-linear effects and incorporates L2-norm-based penalty terms to control the smoothness and goodness-of-fit of these effects. The model is fitted using site-specific aggregated summary statistics, and the optimization problem is solved iteratively with the Newton-Raphson method.

Results: The model was used to assess the smooth effects of clinical features, such as age and numeric laboratory values, on overall survival using RWD from approximately 874 patients with newly diagnosed acute myeloid leukemia (AML) treated at seven distinct sites in the United States. The model identified non-linear smooth effects for lactate dehydrogenase, platelets, and other features, underscoring their strong association with disease prognosis. The model demonstrates a lossless property, providing estimates of smooth and fixed effects that are comparable to those derived from the pooled PED. Additionally, the inference of parameters for testing the nullity of effects remains consistent. The model is communication-efficient, requiring roughly twelve rounds of communication across sites.

Conclusion: We anticipate that this model can facilitate multisite collaboration and enable smaller sites to participate in generating and validating RWE, especially for rare diseases. Although the model was applied here in the context of AML, it is disease-agnostic and can be implemented in any other clinical context and across sites globally without loss of generality.
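
The workflow described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the data are simulated, the three sites are hypothetical, a single linear covariate with a ridge penalty stands in for the orthonormal spline basis and smoothness penalties, and the names `to_ped`, `local_summaries`, and `federated_newton` are invented for illustration. It shows the two ideas the abstract names: expanding survival data into PED rows fitted as a Poisson model with a log-exposure offset, and running Newton-Raphson using only per-site gradients and Hessians.

```python
import numpy as np

def to_ped(time, event, x, cuts):
    """Expand one row per subject into one row per subject-interval at risk."""
    n_int = len(cuts) - 1
    design, died, log_exposure = [], [], []
    for t, d, xi in zip(time, event, x):
        for j in range(n_int):
            lo, hi = cuts[j], cuts[j + 1]
            if t <= lo:          # subject left the risk set before this interval
                break
            baseline = np.zeros(n_int)
            baseline[j] = 1.0    # interval-specific log baseline hazard
            design.append(np.append(baseline, xi))
            died.append(float(d == 1 and t <= hi))
            log_exposure.append(np.log(min(t, hi) - lo))
    return np.array(design), np.array(died), np.array(log_exposure)

def local_summaries(X, y, offset, beta):
    """Aggregates a site would share: Poisson score vector and Hessian."""
    mu = np.exp(X @ beta + offset)
    grad = X.T @ (y - mu)
    hess = -(X * mu[:, None]).T @ X
    return grad, hess

def federated_newton(sites, penalty, max_rounds=50, tol=1e-8):
    """Newton-Raphson on the penalized log-likelihood using only site summaries."""
    beta = np.zeros(penalty.shape[0])
    for rounds in range(1, max_rounds + 1):
        grad = -penalty @ beta   # derivative of -0.5 * beta' P beta
        hess = -penalty.copy()
        for X, y, offset in sites:
            g, h = local_summaries(X, y, offset, beta)
            grad += g
            hess += h
        step = np.linalg.solve(hess, grad)
        beta = beta - step
        if np.max(np.abs(step)) < tol:
            break
    return beta, rounds

# Simulated data (assumption): constant baseline hazard, log hazard ratio 0.5.
rng = np.random.default_rng(0)
n = 600
x = rng.normal(size=n)
latent = rng.exponential(scale=1.0 / np.exp(0.5 * x))
time = np.minimum(latent, 3.0)          # administrative censoring at t = 3
event = (latent <= 3.0).astype(int)
cuts = np.array([0.0, 0.5, 1.0, 2.0, 3.0])

site_ids = np.array_split(np.arange(n), 3)   # three hypothetical sites
sites = [to_ped(time[i], event[i], x[i], cuts) for i in site_ids]
pooled = [to_ped(time, event, x, cuts)]

penalty = 1e-3 * np.eye(len(cuts))      # ridge stand-in for the smoothness penalty
beta_fed, rounds_fed = federated_newton(sites, penalty)
beta_pooled, _ = federated_newton(pooled, penalty)

print("federated:", np.round(beta_fed, 4), "rounds:", rounds_fed)
print("pooled:   ", np.round(beta_pooled, 4))
```

Because the Poisson log-likelihood is a sum over rows, adding the per-site gradients and Hessians reproduces the pooled quantities exactly, which is why the federated and pooled estimates in this sketch agree up to floating-point error. Each pass through the outer loop corresponds to one round of communication; the number of rounds in this toy example should not be read as a statement about the paper's results.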

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Journal of the American Medical Informatics Association (61 papers in training set): Top 0.2%, 14.8%
2. JCO Clinical Cancer Informatics (18 papers in training set): Top 0.1%, 10.1%
3. PLOS ONE (4510 papers in training set): Top 18%, 10.1%
4. BMC Medical Informatics and Decision Making (39 papers in training set): Top 0.3%, 8.4%
5. BMC Medical Research Methodology (43 papers in training set): Top 0.1%, 7.2%

50% of probability mass above

6. Scientific Reports (3102 papers in training set): Top 36%, 3.6%
7. Bioinformatics (1061 papers in training set): Top 5%, 3.6%
8. International Journal of Medical Informatics (25 papers in training set): Top 0.4%, 3.3%
9. Journal of Biomedical Informatics (45 papers in training set): Top 0.6%, 2.7%
10. Bulletin of Mathematical Biology (84 papers in training set): Top 0.7%, 2.6%
11. Computers in Biology and Medicine (120 papers in training set): Top 2%, 2.1%
12. BMC Bioinformatics (383 papers in training set): Top 4%, 2.1%
13. JAMIA Open (37 papers in training set): Top 0.7%, 1.9%
14. Frontiers in Artificial Intelligence (18 papers in training set): Top 0.3%, 1.7%
15. Journal of Medical Internet Research (85 papers in training set): Top 3%, 1.7%
16. Computer Methods and Programs in Biomedicine (27 papers in training set): Top 0.4%, 1.7%
17. PLOS Computational Biology (1633 papers in training set): Top 18%, 1.5%
18. Informatics in Medicine Unlocked (21 papers in training set): Top 0.7%, 1.2%
19. Statistics in Medicine (34 papers in training set): Top 0.3%, 0.8%
20. Biology Methods and Protocols (53 papers in training set): Top 2%, 0.8%
21. Biometrics (22 papers in training set): Top 0.2%, 0.7%
22. PLOS Digital Health (91 papers in training set): Top 3%, 0.7%
23. Biomedicines (66 papers in training set): Top 3%, 0.7%
24. Clinical and Translational Science (21 papers in training set): Top 1%, 0.7%
25. iScience (1063 papers in training set): Top 32%, 0.7%
26. Cancer Research Communications (46 papers in training set): Top 1%, 0.7%
27. Blood (67 papers in training set): Top 1%, 0.7%
28. Cancers (200 papers in training set): Top 5%, 0.7%
29. JMIR Medical Informatics (17 papers in training set): Top 2%, 0.7%
30. Physical Biology (43 papers in training set): Top 2%, 0.7%