
SPELL-LLMs: A Scalable and Privacy-Compliant NLP Pipeline Using Locally Hosted Large Language Models for Clinical Information Extraction

Kleinlein, R.; Gray, K. J.; Bates, D.; Kovacheva, V. P.

medRxiv preprint · 2025-07-25 · health informatics · DOI: 10.1101/2025.07.25.25332130

Objective: Electronic health records (EHRs) contain valuable information for clinical research and decision-making. However, leveraging these data remains challenging due to data heterogeneity, inconsistent documentation, missing information, and evolving terminology, especially within unstructured clinical notes. We developed SPELL (Snippet-Primed rEgex LLM Pipeline), a scalable natural language processing (NLP) workflow to systematically extract structured clinical insights from large volumes of clinical narratives.

Materials and Methods: Our platform employs a hybrid approach, combining regular expressions (regex) to rapidly identify relevant textual snippets with locally hosted large language models (LLMs) for accurate clinical interpretation. All data processing occurs securely within institutional computational environments. The modular Python-based workflow facilitates adaptation across institutions and is optimized for computational efficiency, supporting high-throughput processing even in resource-limited settings. We quantified computational scalability (elapsed time, out-of-memory events, GPU temperature, and energy consumed) and audited retrieval recall using clinician-annotated regex-negative notes enriched with relevant structured metadata.

Results: The pipeline efficiently processed 31 million clinical reports spanning 1976-2024 from eight affiliated hospitals. By analyzing targeted snippets rather than entire documents, our approach reduced processing time by 68% compared to traditional full-document LLM inference, and by >95% compared to manual physician annotation. Accuracy was rigorously validated across three obstetric tasks: extraction of numerical values (blood loss volumes), dates (estimated due dates), and diagnoses (hemolysis, elevated liver enzymes, and low platelets [HELLP] syndrome). Task-level performance included 94-98% exact-match accuracy for the three concepts on curated snippets. Generalizability was investigated using the publicly available MT Samples corpus (5,013 notes, 40 specialties), yielding 84% accuracy for ventricular tachycardia detection with markedly fewer false positives.

Discussion and Conclusions: A hybrid regex → snippet → LLM approach delivers accurate, privacy-preserving, and computationally efficient extraction for unstructured EHR data. By focusing inference on snippets and deploying local, open-weights models, SPELL aligns with institutional data governance requirements while enabling scalable clinical informatics studies across diverse extraction tasks.

Summary Statement: We developed SPELL, a scalable NLP pipeline combining regex and locally hosted LLMs for efficient information extraction from clinical narratives.
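The core efficiency idea, regex screening first and LLM inference only on short snippets, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the pattern, window size, and example note are hypothetical, and the LLM step is shown only as a comment since SPELL uses locally hosted models not reproduced here.

```python
import re

def extract_snippets(note: str, pattern: str, window: int = 80) -> list[str]:
    """Return short character windows around each regex hit in a note.

    Instead of feeding the entire document to an LLM, only these
    snippets are sent for interpretation, which is where the reported
    68% reduction in processing time comes from.
    """
    snippets = []
    for m in re.finditer(pattern, note, flags=re.IGNORECASE):
        start = max(0, m.start() - window)
        end = min(len(note), m.end() + window)
        snippets.append(note[start:end])
    return snippets

# Hypothetical obstetric note fragment (not from the study data).
note = ("... uncomplicated delivery. Estimated blood loss 450 mL. "
        "Patient stable postpartum ...")
snips = extract_snippets(note, r"blood loss", window=30)
# Each snippet (not the whole note) would then be passed to a locally
# hosted LLM, e.g. with a prompt like:
#   f"Extract the blood loss volume in mL from: {snips[0]}"
```

Notes that match no pattern never reach the LLM at all, which is why the authors separately audit retrieval recall on regex-negative notes.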

Matching journals

The top 2 journals together account for over 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Probability |
|------|---------|------------------------|------------|-------------|
| 1 | Journal of the American Medical Informatics Association | 61 | Top 0.1% | 43.5% |
| 2 | Journal of Biomedical Informatics | 45 | Top 0.1% | 15.0% |
| 3 | npj Digital Medicine | 97 | Top 0.6% | 8.8% |
| 4 | JAMIA Open | 37 | Top 0.5% | 3.2% |
| 5 | Journal of Medical Internet Research | 85 | Top 2% | 2.2% |
| 6 | JCO Clinical Cancer Informatics | 18 | Top 0.4% | 1.9% |
| 7 | International Journal of Medical Informatics | 25 | Top 0.7% | 1.9% |
| 8 | Scientific Reports | 3102 | Top 56% | 1.8% |
| 9 | BMC Medical Informatics and Decision Making | 39 | Top 1% | 1.8% |
| 10 | PLOS Digital Health | 91 | Top 1% | 1.7% |
| 11 | European Heart Journal - Digital Health | 15 | Top 0.4% | 1.6% |
| 12 | Frontiers in Digital Health | 20 | Top 0.7% | 1.6% |
| 13 | BMJ Health & Care Informatics | 13 | Top 0.5% | 1.4% |
| 14 | JMIR Medical Informatics | 17 | Top 1% | 1.0% |
| 15 | Med | 38 | Top 0.7% | 0.8% |
| 16 | iScience | 1063 | Top 30% | 0.8% |
| 17 | BMC Medical Research Methodology | 43 | Top 1% | 0.8% |
| 18 | Cureus | 67 | Top 5% | 0.7% |
| 19 | Bioinformatics | 1061 | Top 10% | 0.7% |
| 20 | The Lancet Digital Health | 25 | Top 1% | 0.5% |
| 21 | Inflammatory Bowel Diseases | 15 | Top 0.3% | 0.5% |
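The 50%-mass statement above follows from a simple cumulative sum over the listed probabilities; a quick check (values copied from the list, top five shown):

```python
# Predicted probabilities (percent) for the top-ranked journals, as listed.
probs = [43.5, 15.0, 8.8, 3.2, 2.2]

# Find the rank at which cumulative probability mass first reaches 50%.
cumulative = 0.0
for rank, p in enumerate(probs, start=1):
    cumulative += p
    if cumulative >= 50.0:
        break

print(rank, round(cumulative, 1))  # prints: 2 58.5
```

The threshold is crossed at rank 2 (43.5% + 15.0% = 58.5%), matching the statement that the top two journals carry over half of the predicted probability mass.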