
A Systematic Process for Assessing Fitness-for-Purpose of Health Outcomes for Computable Phenotyping with Electronic Health Record Data

Gatto, N. M.; Cronkite, D. J.; Wartko, P. D.; Ball, R.; Carrell, D. S.; Eniafe, R.; Desai, R. M.; Floyd, J. S.; Lee, T.; Nelson, J. C.; Shebl, F. M.; Schoeplein, R.; Toh, S.; Zhang, M.; Dublin, S.; Hernandez-Munoz, J. J.

2025-09-04 pharmacology and therapeutics

10.1101/2025.08.29.25334394 medRxiv


Purpose: Information from electronic health records (EHRs) may be incorporated into computable phenotype algorithms in efforts to overcome inaccuracies of algorithms based on administrative claims data alone. However, such efforts can be resource-intensive and unsuccessful. Assessing the feasibility of computable phenotyping for a health outcome of interest (HOI) before proceeding is therefore recommended.

Methods: We developed a systematic fitness-for-purpose (FFP) assessment process to implement concepts outlined in a previously described general framework for computable phenotyping incorporating EHR data. Our process includes verifying the HOI is well-defined, reviewing clinical information about the HOI, identifying existing algorithms and their performance, evaluating HOI clinical and data complexity, and determining an overall FFP conclusion and recommendation. We applied this process to ten HOIs lacking high-performing claims-based algorithms, selecting HOIs of public health importance that varied in clinical and data complexity, including neutropenia, pericardial effusion and drug-induced liver injury.

Results: HOIs assessed as having moderate (vs. easy) overall difficulty had characteristics such as the need for natural language processing, integration of multiple laboratory test results, or longitudinal EHR data. HOIs assessed as having high difficulty required using data from multiple EHR sources, ruling out many other potential causes, or relying on low-sensitivity diagnostic tests. Input from experts in EHR data and clinical care was crucial.

Conclusion: EHR data have potential to enhance accuracy of defining certain HOIs for research and surveillance compared to administrative claims data. The process and tools we created will support others in assessing FFP of HOIs for computable phenotyping.
Five key points

- Incorporating electronic health record (EHR) data into computable phenotypes could improve accurate identification of health outcomes of interest (HOIs), but such work can be resource-intensive.
- We developed a systematic fitness-for-purpose (FFP) process and tools to assess the feasibility of computable phenotyping for HOIs.
- Steps include identifying existing algorithms and their performance, ensuring the HOI is well-defined, evaluating clinical and data complexity, and determining a feasibility recommendation.
- Difficulty increased with a need for natural language processing, multiple laboratory tests, longitudinal EHR data, multiple EHR sources or ruling out other potential causes.
- Input from EHR data and clinical care experts was crucial to the FFP assessment process.

Plain Language Summary (PLS)

Attempts to identify diseases and health conditions by applying computer programs to information easily gleaned from the insurance claims of tens of thousands of patients (as in the FDA's ongoing safety monitoring of approved drugs and medical products) are often unsuccessful because the data lack nuance. Incorporating information from electronic health records (EHRs) and patient chart notes may improve accurate identification of health outcomes. Because this can be resource-intensive, we designed a process and tools to assess the feasibility of including EHR data in computer algorithms to identify health outcomes. Steps included identifying existing algorithms and their performance, building familiarity with the outcome and making sure it is well-defined, evaluating clinical and data complexity, and determining a conclusion about feasibility. We applied our process to ten health outcomes of public health importance. Health outcomes were considered moderately difficult for computerized algorithms if they required natural language processing, integration of multiple laboratory tests, or EHR data from multiple timepoints. Health outcomes were considered highly difficult if they required using multiple EHR data types, ruling out many alternative causes of the health outcome (other than medications), or relying on diagnostic tests of low accuracy. Input from EHR data and clinical care experts was crucial for the assessment process.
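The difficulty tiers described above can be illustrated with a minimal Python sketch. It is not the authors' tool: the assessment described here rests on expert judgment, and the class name `HoiProfile`, its boolean fields, and the function `overall_difficulty` are hypothetical names introduced only for illustration. The sketch assumes each listed characteristic can be reduced to a yes/no flag and that any single high-difficulty characteristic outweighs the moderate ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HoiProfile:
    """Hypothetical yes/no characteristics of a health outcome of interest (HOI)."""
    # Characteristics the abstract associates with moderate difficulty
    needs_nlp: bool = False
    needs_multiple_lab_tests: bool = False
    needs_longitudinal_data: bool = False
    # Characteristics the abstract associates with high difficulty
    needs_multiple_ehr_sources: bool = False
    must_rule_out_many_causes: bool = False
    relies_on_low_sensitivity_tests: bool = False


def overall_difficulty(profile: HoiProfile) -> str:
    """Return 'high', 'moderate', or 'easy' (assumed rule: the highest tier wins)."""
    if (profile.needs_multiple_ehr_sources
            or profile.must_rule_out_many_causes
            or profile.relies_on_low_sensitivity_tests):
        return "high"
    if (profile.needs_nlp
            or profile.needs_multiple_lab_tests
            or profile.needs_longitudinal_data):
        return "moderate"
    return "easy"


if __name__ == "__main__":
    # Illustrative profile only; not an assessment of any specific HOI.
    example = HoiProfile(needs_nlp=True, needs_longitudinal_data=True)
    print(overall_difficulty(example))
```

In practice such flags would only summarize a conclusion reached with input from EHR data and clinical care experts; they would not replace that input.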
