LLM-Driven Target Trial Emulation with Human-in-the-Loop Validation for Randomized Trial: Automated Protocol Extraction and Real-World Outcome Evaluation

Dey, S. K.; Qureshi, A. I.; Shyu, C.-R.

2026-04-13 health informatics
10.64898/2026.04.09.26350523 medRxiv
Target trial emulation (TTE) enables causal inference from observational data but remains bottlenecked by manual, expert-dependent protocol operationalization. While large language models (LLMs) have advanced clinical knowledge extraction and code generation, their ability to automate end-to-end TTE workflows remains largely unexplored. We present an LLM-driven framework that uses retrieval-augmented generation to extract the five core TTE design parameters from the Carotid Revascularization and Medical Management for Asymptomatic Carotid Stenosis Trial (CREST-2) protocol and to generate executable phenotyping pipelines for real-world EHR data. The framework was evaluated along two dimensions. First, protocol extraction accuracy was assessed against a gold-standard checklist of trial design components using precision, recall, and F1-score. Second, outcome validity was evaluated through population-level concordance analyses comparing EHR-derived outcomes with published trial endpoints using standardized mean differences, observed-to-expected ratios, confidence interval overlap, and two-proportion z-tests. In addition, human-in-the-loop validation assessed the correctness of the extracted clinical logic and phenotype definitions. Together, these evaluations demonstrate a structured approach for assessing LLM-driven protocol-to-pipeline translation for scalable real-world evidence generation.
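
The evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the set-based matching of checklist items, the pooled-variance form of the z-test, and all input values used with it are assumptions made for illustration only.

```python
import math


def precision_recall_f1(extracted, gold):
    """Set-based precision, recall, and F1 of extracted items against a gold checklist.

    Assumes (hypothetically) that an extracted item counts as correct only if it
    exactly matches an item on the gold-standard checklist.
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a, p_b = events_a / n_a, events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def observed_to_expected(observed_events, n, expected_rate):
    """Ratio of observed events to the count expected at a reference rate."""
    return observed_events / (n * expected_rate)
```

For example, `precision_recall_f1(extracted_items, checklist_items)` would score one extraction run, and `two_proportion_z_test` would compare an EHR-derived event proportion with a published trial proportion; any counts passed in would come from the study data, which this page does not report.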

Matching journals

The top 2 journals account for 50% of the predicted probability mass.

1. npj Digital Medicine: 97 papers in training set, Top 0.1%, 42.3%
2. Journal of the American Medical Informatics Association: 61 papers in training set, Top 0.2%, 10.8%
(50% of probability mass above)
3. Journal of Biomedical Informatics: 45 papers in training set, Top 0.2%, 7.7%
4. Scientific Reports: 3102 papers in training set, Top 32%, 3.9%
5. Nature Communications: 4913 papers in training set, Top 46%, 2.2%
6. Med: 38 papers in training set, Top 0.1%, 2.2%
7. JCO Clinical Cancer Informatics: 18 papers in training set, Top 0.4%, 2.0%
8. JAMIA Open: 37 papers in training set, Top 0.7%, 1.9%
9. European Heart Journal - Digital Health: 15 papers in training set, Top 0.3%, 1.8%
10. Nature Biomedical Engineering: 42 papers in training set, Top 0.9%, 1.6%
11. PLOS Digital Health: 91 papers in training set, Top 2%, 1.4%
12. Science Translational Medicine: 111 papers in training set, Top 4%, 1.2%
13. BMC Medical Informatics and Decision Making: 39 papers in training set, Top 2%, 1.0%
14. Annals of Internal Medicine: 27 papers in training set, Top 0.7%, 1.0%
15. PLOS ONE: 4510 papers in training set, Top 65%, 0.8%
16. Bioinformatics: 1061 papers in training set, Top 9%, 0.8%
17. The Lancet Digital Health: 25 papers in training set, Top 0.9%, 0.8%
18. Patterns: 70 papers in training set, Top 2%, 0.8%
19. JMIR Medical Informatics: 17 papers in training set, Top 1%, 0.8%
20. Nature Medicine: 117 papers in training set, Top 4%, 0.8%
21. Scientific Data: 174 papers in training set, Top 3%, 0.7%
22. Briefings in Bioinformatics: 326 papers in training set, Top 7%, 0.7%
23. Communications Medicine: 85 papers in training set, Top 2%, 0.5%
24. Clinical and Translational Science: 21 papers in training set, Top 1%, 0.5%
25. PLOS Computational Biology: 1633 papers in training set, Top 28%, 0.5%
26. The American Journal of Human Genetics: 206 papers in training set, Top 5%, 0.5%