
Development and Evaluation of a Digital Scribe: Conversation Summarization Pipeline for Emergency Department Counseling Sessions towards Reducing Documentation Burden

Sezgin, E.; Sirrianni, J.; Kranz, K.

Posted: 2023-12-07 · Subject: emergency medicine
DOI: 10.1101/2023.12.06.23299573 (medRxiv preprint)

Objective: We present a proof-of-concept digital scribe system as an ED clinical conversation summarization pipeline and report its performance.

Materials and Methods: We use four pre-trained large language models to establish the digital scribe system: T5-small, T5-base, PEGASUS-PubMed, and BART-Large-CNN, via zero-shot and fine-tuning approaches. Our dataset includes 100 referral conversations among ED clinicians and medical records. We report ROUGE-1, ROUGE-2, and ROUGE-L to compare model performance. In addition, we annotated transcriptions to assess the quality of generated summaries.

Results: The fine-tuned BART-Large-CNN model demonstrates the strongest summarization performance, with the highest ROUGE scores (F1 ROUGE-1 = 0.49, F1 ROUGE-2 = 0.23, F1 ROUGE-L = 0.35). In contrast, PEGASUS-PubMed lags notably (F1 ROUGE-1 = 0.28, F1 ROUGE-2 = 0.11, F1 ROUGE-L = 0.22). BART-Large-CNN's performance decreases by more than 50% with the zero-shot approach. Annotations show that BART-Large-CNN achieves 71.4% recall in identifying key information and a 67.7% accuracy rate.

Discussion: The BART-Large-CNN model demonstrates a strong grasp of clinical dialogue structure, indicated by its performance with and without fine-tuning. Despite some instances of high recall, the model's performance is variable, particularly in achieving consistent correctness, suggesting room for refinement. The model's recall also varies across information categories.

Conclusion: The study provides evidence for the potential of AI-assisted tools to reduce clinical documentation burden. Future work should expand the research scope to larger language models and add comparative analyses that measure documentation effort and time.
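The evaluation above rests on ROUGE overlap metrics. As a rough illustration of what those F1 scores measure, here is a minimal from-scratch sketch of ROUGE-N and ROUGE-L; the paper's actual tooling is not stated, and the function names and example sentences here are purely illustrative:

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_f1(candidate, reference, n):
    """ROUGE-N F1: harmonic mean of n-gram precision and recall."""
    c, r = candidate.lower().split(), reference.lower().split()
    cg, rg = ngrams(c, n), ngrams(r, n)
    overlap = sum((cg & rg).values())  # clipped n-gram overlap count
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cg.values())
    recall = overlap / sum(rg.values())
    return 2 * precision * recall / (precision + recall)

def lcs_len(a, b):
    """Length of the longest common subsequence (standard DP)."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
    return dp[-1][-1]

def rouge_l_f1(candidate, reference):
    """ROUGE-L F1: based on longest common subsequence of tokens."""
    c, r = candidate.lower().split(), reference.lower().split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)

# Toy example (not from the paper's data):
cand = "patient reports chest pain"
ref = "patient has chest pain"
print(round(rouge_n_f1(cand, ref, 1), 2))  # 0.75 (3 of 4 unigrams shared)
print(round(rouge_l_f1(cand, ref), 2))     # 0.75 (LCS "patient chest pain")
```

ROUGE-N rewards exact n-gram overlap, while ROUGE-L credits in-order wording via the longest common subsequence without requiring contiguity, which is why a model can score well on ROUGE-1 yet lower on ROUGE-2 and ROUGE-L, as seen in the reported results.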

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

Rank  Journal                                                   Papers in training set  Percentile  Probability
1     Journal of Medical Internet Research                      85                      Top 0.1%    27.1%
2     Artificial Intelligence in Medicine                       15                      Top 0.1%    19.5%
3     PLOS ONE                                                  4510                    Top 30%     5.1%
      (50% of predicted probability mass above this line)
4     International Journal of Medical Informatics              25                      Top 0.3%    4.1%
5     JAMIA Open                                                37                      Top 0.3%    4.1%
6     npj Digital Medicine                                      97                      Top 1%      4.1%
7     Scientific Reports                                        3102                    Top 33%     3.8%
8     Journal of Biomedical Informatics                         45                      Top 0.5%    3.2%
9     PLOS Digital Health                                       91                      Top 1.0%    2.6%
10    Frontiers in Digital Health                               20                      Top 0.6%    1.8%
11    Journal of the American Medical Informatics Association   61                      Top 1%      1.6%
12    Healthcare                                                16                      Top 0.7%    1.6%
13    Annals of Translational Medicine                          17                      Top 0.8%    1.4%
14    Emergency Medicine Journal                                20                      Top 0.4%    1.3%
15    IEEE Journal of Biomedical and Health Informatics         34                      Top 2%      1.0%
16    BMC Medical Informatics and Decision Making               39                      Top 2%      0.9%
17    Frontiers in Public Health                                140                     Top 7%      0.9%
18    BMC Medical Research Methodology                          43                      Top 1%      0.8%
19    JMIR Formative Research                                   32                      Top 2%      0.8%
20    Computer Methods and Programs in Biomedicine              27                      Top 0.9%    0.8%
21    iScience                                                  1063                    Top 30%     0.8%
22    BJPsych Open                                              25                      Top 0.7%    0.8%
23    Heliyon                                                   146                     Top 6%      0.8%
24    Frontiers in Medicine                                     113                     Top 8%      0.7%
25    Cureus                                                    67                      Top 5%      0.7%
26    Medicine                                                  30                      Top 3%      0.5%
27    BioMed Research International                             25                      Top 4%      0.5%
28    Journal of General Internal Medicine                      20                      Top 1%      0.5%