Can Large Language Models Reduce the Cost of Extracting Data from Electronic Health Records for Research?

2026-01-11 health informatics Title + abstract only
View on medRxiv

Objective: Much medical data is only available in unstructured electronic health records (EHRs). These data can be obtained through manual (human) extraction or programmatic natural language processing (NLP) methods. We estimate that NLP becomes economically competitive with manual extraction only when there are ~6500 EHR records. We have found that there is interest from clinicians and researchers in using NLP on projects with fewer records. We examine whether a large language model (LLM) can be ...