
Estimating Chronic Kidney Disease Stage Transitions from Irregular Electronic Health Record Data Using an Expectation-Maximization Framework

Qi, W.; Lobo, J. M.; Yan, G.; Ghenbot, R.; Sands, K. G.; Krupski, T. L.; Culp, S. H.; Otero-Leon, D.

2026-03-09 urology
10.64898/2026.03.08.26347890 medRxiv

Objective: To estimate chronic kidney disease (CKD) stage transition probabilities in patients with small renal masses (SRMs) using irregularly observed electronic health record (EHR) data, addressing the challenges of interval censoring and irregular measurement intervals in real-world clinical practice.

Data Sources: We used EHR data from the University of Virginia Small Renal Mass (SRM) registry (2006 to January 2026), capturing outpatient renal function data prior to any definitive treatment. CKD stages were defined using estimated glomerular filtration rate (eGFR) thresholds based on KDIGO guidelines.

Study Design: The final analytic cohort included 527 patients with at least two outpatient eGFR measurements prior to definitive treatment. We applied an expectation-maximization (EM) algorithm to estimate discrete-time CKD stage transition matrices while accounting for irregular follow-up and unobserved intermediate transitions. Transition matrices were estimated under 3- and 6-month cycle lengths, both overall and stratified by age and sex. Likelihood ratio tests were used to compare EM-based estimates with a naive one-step counting estimator.

Results: The EM framework yielded clinically plausible transition structures dominated by self-transitions and progression primarily to adjacent CKD stages, with fewer spurious backward transitions than the naive estimator. Transition patterns were consistent across 3- and 6-month cycle lengths. Age-stratified analyses showed that older patients had slightly higher probabilities of progression to more advanced CKD stages than younger patients, whereas sex-stratified differences were minimal. Likelihood ratio comparisons supported the consistency of the EM-based models with the observed transition data in both the overall cohort and subgroup analyses.

Conclusions: The EM approach provides a principled and computationally efficient method for estimating CKD stage progression from irregularly observed EHR data, yielding transition matrices suitable for discrete-time decision-analytic and health economic models.
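The abstract does not give implementation details, so the following is only a minimal sketch of one standard way an EM estimator for a discrete-time transition matrix can be set up when consecutive observations are separated by a varying number of cycles. It is not the authors' code. The function name `em_transition_matrix`, the `(start_state, end_state, n_cycles)` input format, the uniform starting matrix, and the assumption that each gap has already been rounded to a whole number of cycles are all illustrative choices made here.

```python
import numpy as np

def em_transition_matrix(pairs, n_states, n_iter=200, tol=1e-8):
    """Hypothetical EM sketch for a discrete-time Markov transition matrix.

    pairs: list of (start_state, end_state, n_cycles) with n_cycles >= 1,
           i.e. consecutive observations already expressed in whole cycles
           (an assumption of this sketch, not a detail from the abstract).
    """
    # Start from a uniform matrix so every path has positive probability.
    P = np.full((n_states, n_states), 1.0 / n_states)
    max_gap = max(n for _, _, n in pairs)

    for _ in range(n_iter):
        # Precompute P^0, P^1, ..., P^max_gap for the current estimate.
        powers = [np.eye(n_states)]
        for _ in range(max_gap):
            powers.append(powers[-1] @ P)

        # E-step: expected number of i -> j moves hidden inside each gap.
        expected = np.zeros((n_states, n_states))
        for a, b, n in pairs:
            denom = powers[n][a, b]  # P(end = b | start = a) over n cycles
            if denom <= 0:
                continue  # pair impossible under current P; skip defensively
            for t in range(n):
                # P^t[a, i] * P[i, j] * P^(n-1-t)[j, b], for all i, j at once
                expected += (
                    powers[t][a, :, None] * P * powers[n - 1 - t][None, :, b]
                ) / denom

        # M-step: row-normalize expected counts; rows with no mass keep P.
        row_sums = expected.sum(axis=1, keepdims=True)
        safe = np.where(row_sums > 0, row_sums, 1.0)
        new_P = np.where(row_sums > 0, expected / safe, P)

        converged = np.max(np.abs(new_P - P)) < tol
        P = new_P
        if converged:
            break
    return P
```

When every gap is exactly one cycle, the E-step reduces to plain transition counts, so this sketch returns the same matrix as a one-step counting estimator. Longer gaps are where the two approaches differ, because the expected counts spread each observed pair over the unobserved intermediate stages.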

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Journal of the American Society of Nephrology: 52 papers in training set, Top 0.1%, 15.1%
2. Journal of the American Medical Informatics Association: 61 papers in training set, Top 0.2%, 12.9%
3. PLOS ONE: 4510 papers in training set, Top 17%, 10.4%
4. Nature Communications: 4913 papers in training set, Top 32%, 5.0%
5. BMC Medical Research Methodology: 43 papers in training set, Top 0.1%, 5.0%
6. PLOS Digital Health: 91 papers in training set, Top 0.5%, 4.4%

50% of probability mass above

7. Kidney360: 22 papers in training set, Top 0.2%, 4.3%
8. Scientific Reports: 3102 papers in training set, Top 34%, 3.7%
9. JAMA Network Open: 127 papers in training set, Top 2%, 1.9%
10. BMJ Open: 554 papers in training set, Top 8%, 1.8%
11. PLOS Computational Biology: 1633 papers in training set, Top 16%, 1.7%
12. Kidney International Reports: 14 papers in training set, Top 0.2%, 1.7%
13. Frontiers in Medicine: 113 papers in training set, Top 4%, 1.5%
14. BMC Nephrology: 13 papers in training set, Top 0.2%, 1.4%
15. Journal of Clinical Medicine: 91 papers in training set, Top 4%, 1.4%
16. Bioinformatics: 1061 papers in training set, Top 8%, 1.3%
17. Science Translational Medicine: 111 papers in training set, Top 4%, 1.1%
18. BMC Medical Informatics and Decision Making: 39 papers in training set, Top 2%, 1.0%
19. Diabetologia: 36 papers in training set, Top 0.8%, 0.9%
20. The Lancet Digital Health: 25 papers in training set, Top 0.8%, 0.9%
21. American Journal of Epidemiology: 57 papers in training set, Top 1%, 0.8%
22. CMAJ Open: 12 papers in training set, Top 0.2%, 0.8%
23. Annals of Internal Medicine: 27 papers in training set, Top 0.9%, 0.8%
24. Medical Decision Making: 10 papers in training set, Top 0.3%, 0.8%
25. JMIR Public Health and Surveillance: 45 papers in training set, Top 4%, 0.8%
26. Statistics in Medicine: 34 papers in training set, Top 0.3%, 0.8%
27. Pharmacoepidemiology and Drug Safety: 13 papers in training set, Top 0.4%, 0.8%
28. PLOS Medicine: 98 papers in training set, Top 4%, 0.8%
29. npj Digital Medicine: 97 papers in training set, Top 4%, 0.7%
30. BMC Medicine: 163 papers in training set, Top 8%, 0.7%