
MedMatch: a first step for the automation of large language model performance benchmarking for medication-related tasks

2026-01-15, health informatics, medRxiv

Background: The accuracy and safety of generating medication orders by large language models (LLMs) must be demonstrated. Without standardization, performance evaluation is limited to time- and resource-intensive clinician grading. This evaluation aimed to develop a standardized medication format that supports automated performance evaluation (MedMatch).

Methods: First, a survey of 40 medication prompts was given to clinicians to assess agreement in medication order communication. Second, a clinicia...
