PyTMLE: A Flexible Python Library for Targeted Estimation of Survival and Competing Risks using Causal Machine Learning

Guski, J.; Aborageh, M.; Fröhlich, H.

2025-07-03 health informatics
10.1101/2025.07.02.25330730 medRxiv

Background: Targeted estimation offers a robust and unbiased approach for causal inference of the average treatment effect (ATE) from observational data, even with confounding, dependent censoring, and competing risks. Its advantages include double robustness, statistical rigor, and flexible data-adaptive modeling, potentially leveraging machine or deep learning. However, existing implementations lack model selection flexibility and are R-based, hindering adoption by the Python-focused machine learning community.

Results: We propose PyTMLE, a flexible Python package for causal machine learning-based targeted estimation with survival outcomes and competing risks. PyTMLE supports scikit-survival and pycox and includes built-in robustness checks based on E-values. It is easy to use: by default, initial estimates of the nuisance parameters are obtained via super learning. We showcase its basic usage on the established Hodgkin's disease dataset, where our package reveals the protective effect of chemotherapy on relapse risk.

Conclusions: This package promotes targeted estimation in time-to-event analysis for applied machine learning, enabling fully data-adaptive nuisance parameter estimation, potentially with deep learning. Future enhancements may include time-dependent confounders and dynamic treatment regimes.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. BMC Medical Research Methodology: 43 papers in training set, Top 0.1%, 33.1%
2. JCO Clinical Cancer Informatics: 18 papers in training set, Top 0.1%, 6.4%
3. PLOS ONE: 4510 papers in training set, Top 33%, 4.4%
4. BMC Bioinformatics: 383 papers in training set, Top 2%, 4.3%
5. Bioinformatics: 1061 papers in training set, Top 5%, 4.3%
50% of probability mass above
6. BMC Medical Informatics and Decision Making: 39 papers in training set, Top 0.6%, 4.3%
7. Journal of Medical Internet Research: 85 papers in training set, Top 1%, 3.6%
8. Frontiers in Artificial Intelligence: 18 papers in training set, Top 0.1%, 3.6%
9. International Journal of Medical Informatics: 25 papers in training set, Top 0.5%, 3.1%
10. Scientific Reports: 3102 papers in training set, Top 44%, 2.7%
11. Journal of the American Medical Informatics Association: 61 papers in training set, Top 1%, 1.7%
12. BMC Research Notes: 29 papers in training set, Top 0.1%, 1.7%
13. JMIR Medical Informatics: 17 papers in training set, Top 0.8%, 1.5%
14. Nature Communications: 4913 papers in training set, Top 53%, 1.5%
15. Computer Methods and Programs in Biomedicine: 27 papers in training set, Top 0.5%, 1.2%
16. JAMIA Open: 37 papers in training set, Top 1%, 1.0%
17. Informatics in Medicine Unlocked: 21 papers in training set, Top 0.9%, 0.9%
18. BMC Infectious Diseases: 118 papers in training set, Top 5%, 0.8%
19. JMIR mHealth and uHealth: 10 papers in training set, Top 0.4%, 0.8%
20. npj Digital Medicine: 97 papers in training set, Top 3%, 0.8%
21. Pharmacoepidemiology and Drug Safety: 13 papers in training set, Top 0.4%, 0.7%
22. Clinical Cancer Research: 58 papers in training set, Top 2%, 0.7%
23. Biomedicines: 66 papers in training set, Top 3%, 0.7%
24. Biometrics: 22 papers in training set, Top 0.2%, 0.7%
25. Statistics in Medicine: 34 papers in training set, Top 0.3%, 0.7%
26. Database: 51 papers in training set, Top 1.0%, 0.7%
27. BMJ Open: 554 papers in training set, Top 13%, 0.7%
28. European Journal of Epidemiology: 40 papers in training set, Top 0.9%, 0.6%
29. Biology Methods and Protocols: 53 papers in training set, Top 3%, 0.6%
30. Frontiers in Digital Health: 20 papers in training set, Top 2%, 0.6%