Real-world EHR-derived progression-free survival across successive lines of therapy informs metastatic breast cancer risk stratification

Zhao, X.; Niederhauser, T.; Balazs, Z.; Wicki, A.; Fan, B.; Krauthammer, M.

2026-03-02 health informatics
10.64898/2026.02.24.26346242
Guideline-based recommendations for metastatic lines of therapy (mLoTs), especially second lines and beyond, are comparatively sparse due to challenges in later-line treatment efficacy quantification. Scalable real-world evidence that captures the interaction between treatment and disease progression is therefore especially valuable, as regimens become increasingly individualized, confounding intensifies, and progression is rarely recorded as a structured EHR endpoint. We present a framework to (i) reconstruct clinically coherent mLoTs from longitudinal EHR data using radiology-anchored progression evidence and (ii) generate individualized progression-free survival (PFS) estimates from a line-start multimodal snapshot in a highly heterogeneous cohort. In 2,881 patients contributing 8,791 mLoTs, the selected model shows strong discrimination over a 2-year horizon (Antolini's C = 0.680 ± 0.006; cumulative/dynamic AUC at 1 year = 0.824 ± 0.006). Predicted risk strata closely track Kaplan-Meier trends across line number and tumor subtypes, enabling calibrated risk stratification even in smaller sub-cohorts. Model prediction primarily relies on clinically plausible signals of recent metastatic burden and tumor markers, with limited dependence on surveillance cadence or subtype labels, and is robust to missingness. Together, this framework supports scalable evidence generation and interpretable, calibrated prognostication to inform risk assessment and care planning in heterogeneous metastatic practice.

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1 | npj Digital Medicine | based on 85 papers | Top 0.9% | 15.7%
2 | Nature Communications | based on 483 papers | Top 6% | 11.4%
3 | JCO Clinical Cancer Informatics | based on 14 papers | Top 0.2% | 7.7%
4 | Journal of the American Medical Informatics Association | based on 53 papers | Top 2% | 5.4%
5 | JAMA Network Open | based on 125 papers | Top 3% | 4.8%
6 | BMC Medical Informatics and Decision Making | based on 36 papers | Top 3% | 4.6%
7 | JAMIA Open | based on 35 papers | Top 3% | 3.0%
50% of probability mass above
8 | The Lancet Digital Health | based on 25 papers | Top 0.6% | 2.9%
9 | Nature Medicine | based on 88 papers | Top 5% | 2.4%
10 | eLife | based on 262 papers | Top 12% | 2.4%
11 | Journal of Biomedical Informatics | based on 37 papers | Top 3% | 2.3%
12 | BMJ Health & Care Informatics | based on 13 papers | Top 1% | 2.3%
13 | Cancer Epidemiology, Biomarkers & Prevention | based on 14 papers | Top 2% | 1.6%
14 | Science Translational Medicine | based on 40 papers | Top 2% | 1.6%
15 | Breast Cancer Research | based on 11 papers | Top 0.7% | 1.6%
16 | Scientific Reports | based on 701 papers | Top 72% | 1.6%
17 | PLOS ONE | based on 1737 papers | Top 91% | 1.4%
18 | eBioMedicine | based on 82 papers | Top 4% | 1.3%
19 | Cancers | based on 57 papers | Top 6% | 1.2%
20 | PLOS Digital Health | based on 88 papers | Top 10% | 1.2%
21 | JMIR Medical Informatics | based on 16 papers | Top 5% | 0.8%
22 | Frontiers in Digital Health | based on 18 papers | Top 4% | 0.8%
23 | Communications Medicine | based on 63 papers | Top 3% | 0.8%
24 | Annals of Internal Medicine | based on 27 papers | Top 2% | 0.8%
25 | Journal of Medical Internet Research | based on 81 papers | Top 14% | 0.8%
26 | BMJ Open | based on 553 papers | Top 53% | 0.7%
27 | Clinical Cancer Research | based on 22 papers | Top 4% | 0.7%
28 | BMC Medical Research Methodology | based on 41 papers | Top 6% | 0.7%