Estimating Chronic Kidney Disease Stage Transitions from Irregular Electronic Health Record Data Using an Expectation-Maximization Framework
Qi, W.; Lobo, J. M.; Yan, G.; Ghenbot, R.; Sands, K. G.; Krupski, T. L.; Culp, S. H.; Otero-Leon, D.
Objective: To estimate chronic kidney disease (CKD) stage transition probabilities in patients with small renal masses (SRMs) using irregularly observed electronic health record (EHR) data, addressing challenges of interval censoring and irregular measurement intervals in real-world clinical practice.
Data Sources: We used EHR data from the University of Virginia Small Renal Mass (SRM) registry (2006 to January 2026), capturing outpatient renal function data prior to any definitive treatment. CKD stages were defined using estimated glomerular filtration rate (eGFR) thresholds based on KDIGO guidelines.
Study Design: The final analytic cohort included 527 patients with at least two outpatient eGFR measurements prior to definitive treatment. We applied an expectation-maximization (EM) algorithm to estimate discrete-time CKD stage transition matrices while accounting for irregular follow-up and unobserved intermediate transitions. Transition matrices were estimated under 3- and 6-month cycle lengths, both overall and stratified by age and sex. Likelihood ratio tests were used to compare EM-based estimates with a naive one-step counting estimator.
Results: The EM framework yielded clinically plausible transition structures dominated by self-transitions and progression primarily to adjacent CKD stages, with reduced spurious backward transitions relative to the naive estimator. Transition patterns were consistent across 3- and 6-month cycle lengths. Age-stratified analyses showed that older patients had slightly higher probabilities of progression to more advanced CKD stages compared with younger patients, whereas sex-stratified differences were minimal. Likelihood ratio comparisons supported the consistency of the EM-based models with the observed transition data in both the overall cohort and subgroup analyses.
Conclusions: The EM approach provides a principled and computationally efficient method for estimating CKD stage progression from irregularly observed EHR data, yielding transition matrices suitable for discrete-time decision-analytic and health economic models.
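The abstract does not give implementation details, so the following is only a minimal sketch of how an EM estimator of this general kind might look, not the authors' code. It assumes each pair of consecutive observations has been reduced to a hypothetical tuple `(i, j, k)`: stage `i` observed, then stage `j` observed `k` whole cycles later. The function names, the uniform initialization, and the rounding of gaps to integer cycle counts are all illustrative assumptions. The naive estimator shown is one plausible reading of "one-step counting": it treats every observed pair as a single transition regardless of the gap.

```python
import numpy as np


def naive_estimator(pairs, n_states):
    """Count each observed (i, j) pair as one transition, ignoring the gap k."""
    counts = np.zeros((n_states, n_states))
    for i, j, _ in pairs:
        counts[i, j] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observations are left as zeros.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


def em_estimator(pairs, n_states, n_iter=200, tol=1e-8):
    """EM estimate of a one-cycle transition matrix from gapped observations.

    E-step: expected number of hidden one-cycle transitions a -> b inside each
    gap, given the observed endpoints. M-step: row-normalize those counts.
    """
    pairs = list(pairs)
    max_k = max(k for _, _, k in pairs)
    # Assumption: uniform start, so no transition is ruled out a priori.
    P = np.full((n_states, n_states), 1.0 / n_states)
    for _ in range(n_iter):
        # powers[s] holds P**s for s = 0..max_k.
        powers = [np.eye(n_states)]
        for _ in range(max_k):
            powers.append(powers[-1] @ P)
        counts = np.zeros((n_states, n_states))
        for i, j, k in pairs:
            denom = powers[k][i, j]  # probability of the observed endpoints
            if denom <= 0:
                continue
            for s in range(k):
                # Pr(state a at step s, state b at step s + 1 | i at 0, j at k)
                left = powers[s][i, :, None]           # shape (n, 1)
                right = powers[k - 1 - s][None, :, j]  # shape (1, n)
                counts += left * P * right / denom
        row_sums = counts.sum(axis=1, keepdims=True)
        safe = np.where(row_sums > 0, row_sums, 1.0)
        # Rows with no expected visits keep their previous values.
        P_new = np.where(row_sums > 0, counts / safe, P)
        converged = np.max(np.abs(P_new - P)) < tol
        P = P_new
        if converged:
            break
    return P


def log_likelihood(P, pairs):
    """Log-likelihood of the observed endpoint pairs under matrix P."""
    return sum(np.log(np.linalg.matrix_power(P, k)[i, j]) for i, j, k in pairs)


if __name__ == "__main__":
    # Toy data with three stages; values are made up for illustration only.
    toy = [(0, 0, 2), (0, 1, 3), (1, 1, 1), (1, 2, 4), (2, 2, 1), (0, 2, 6)]
    print(np.round(em_estimator(toy, 3), 3))
    print(np.round(naive_estimator(toy, 3), 3))
```

When every gap is exactly one cycle, the E-step reduces to plain counting and the two estimators coincide; they differ only when intermediate transitions are unobserved. EM converges to a local maximum of the likelihood, so in practice the result may depend on the starting matrix.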
Matching journals
The top 6 journals account for 50% of the predicted probability mass.