Clinical encounter heterogeneity and methods for resolving in networked EHR data: A study from N3C and RECOVER programs
Leese, P. J.; Anand, A.; Girvin, A.; Bennett, T.; Hajagos, J.; Patel, S.; Yoo, J.; Pfaff, E.; Moffitt, R.
OBJECTIVE: Clinical encounter data are heterogeneous and vary greatly from institution to institution. This variance affects the interpretability and usability of clinical encounter data for analysis, and the problems are magnified when multi-site electronic health record (EHR) data are networked together. This paper presents a novel, generalizable method for resolving encounter heterogeneity for analysis by combining related atomic encounters into composite macrovisits.
MATERIALS AND METHODS: Encounters were composed of data from 75 partner sites harmonized to a common data model as part of the NIH Researching COVID to Enhance Recovery Initiative, a project of the National COVID Cohort Collaborative. Summary statistics were computed for overall and site-level data to assess issues and identify modifications. Two algorithms were developed to refine atomic encounters into cleaner, analyzable longitudinal clinical visits.
RESULTS: Atomic inpatient encounter data were found to be widely disparate between sites in terms of length-of-stay (LOS) and numbers of OMOP CDM measurements per encounter. After aggregating encounters to macrovisits, LOS and measurement variance decreased. A subsequent algorithm to identify hospitalized macrovisits further reduced data variability.
DISCUSSION: Encounters are a complex and heterogeneous component of EHR data, and native data issues are not addressed by existing methods. Such complex and poorly studied issues contribute to the difficulty of deriving value from EHR data, and foundational, large-scale explorations and developments of this kind are necessary to realize the full potential of modern real-world data.
CONCLUSION: This paper presents methods to manipulate and resolve EHR encounter data issues in a generalizable way as a foundation for future research and analysis.
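The abstract describes combining related atomic encounters into composite macrovisits but does not state the merging rule. The sketch below is only an illustration of the general idea, assuming that encounters for the same patient are "related" when their date ranges overlap. The `Encounter` and `Macrovisit` types, the `build_macrovisits` function, and the overlap rule itself are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from datetime import date
from itertools import groupby


@dataclass(frozen=True)
class Encounter:
    """Hypothetical atomic encounter record (not the OMOP schema)."""
    patient_id: str
    start: date
    end: date


@dataclass
class Macrovisit:
    """Hypothetical composite visit built from one or more encounters."""
    patient_id: str
    start: date
    end: date
    n_encounters: int

    @property
    def length_of_stay_days(self) -> int:
        return (self.end - self.start).days


def build_macrovisits(encounters):
    """Merge overlapping encounters per patient.

    ASSUMPTION: two encounters belong together when the later one starts
    on or before the running end date of the current macrovisit. The
    paper's actual criteria may differ.
    """
    ordered = sorted(encounters, key=lambda e: (e.patient_id, e.start, e.end))
    macrovisits = []
    for patient_id, group in groupby(ordered, key=lambda e: e.patient_id):
        current = None
        for enc in group:
            if current is not None and enc.start <= current.end:
                # Overlap: extend the open macrovisit.
                current.end = max(current.end, enc.end)
                current.n_encounters += 1
            else:
                # Gap: start a new macrovisit for this patient.
                current = Macrovisit(patient_id, enc.start, enc.end, 1)
                macrovisits.append(current)
    return macrovisits


example = [
    Encounter("p1", date(2021, 1, 1), date(2021, 1, 3)),
    Encounter("p1", date(2021, 1, 2), date(2021, 1, 5)),
    Encounter("p1", date(2021, 1, 10), date(2021, 1, 11)),
    Encounter("p2", date(2021, 1, 1), date(2021, 1, 1)),
]
result = build_macrovisits(example)
```

In this toy input, the two overlapping `p1` encounters collapse into one macrovisit, while the later `p1` encounter and the `p2` encounter each remain separate. Sorting by patient and start date first keeps the merge to a single pass over each patient's encounters.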
Matching journals
The top 3 journals account for 50% of the predicted probability mass.