Filling the gaps: leveraging large language models for temporal harmonization of clinical text across multiple medical visits for clinical prediction

Choi, I.; Long, Q.; Getzen, E.

2024-05-07 · Intensive care and critical care medicine
medRxiv · DOI: 10.1101/2024.05.06.24306959
Electronic health records offer great promise for early disease detection, treatment evaluation, information discovery, and other important facets of precision health. Clinical notes, in particular, may contain nuanced information about a patient's condition, treatment plans, and history that structured data may not capture. As a result, and with advancements in natural language processing, clinical notes have been increasingly used in supervised prediction models. To predict long-term outcomes such as chronic disease and mortality, it is often advantageous to leverage data occurring at multiple time points in a patient's history. However, these data are often collected at irregular time intervals and varying frequencies, thus posing an analytical challenge. Here, we propose the use of large language models (LLMs) for robust temporal harmonization of clinical notes across multiple visits. We compare multiple state-of-the-art LLMs in their ability to generate useful information during time gaps, and evaluate performance in supervised deep learning models for clinical prediction.

Matching journals

The top 2 journals account for 50% of the predicted probability mass.

Rank  Journal                                                  Papers in training set  Percentile  Probability
 1    Journal of Biomedical Informatics                          45                    Top 0.1%    41.2%
 2    npj Digital Medicine                                       97                    Top 0.3%    14.9%
 3    Bioinformatics                                           1061                    Top 5%       4.1%
 4    Journal of the American Medical Informatics Association    61                    Top 0.7%     3.7%
 5    Science Translational Medicine                            111                    Top 1%       3.4%
 6    Nature                                                    575                    Top 8%       2.8%
 7    Proceedings of the National Academy of Sciences          2130                    Top 24%      2.8%
 8    eBioMedicine                                              130                    Top 0.5%     2.8%
 9    iScience                                                 1063                    Top 9%       2.2%
10    Scientific Reports                                       3102                    Top 56%      1.8%
11    eLife                                                    5422                    Top 44%      1.5%
12    PLOS ONE                                                 4510                    Top 58%      1.4%
13    Nature Medicine                                           117                    Top 4%       0.9%
14    Imaging Neuroscience                                      242                    Top 3%       0.9%
15    Nature Machine Intelligence                                61                    Top 3%       0.9%
16    Nature Human Behaviour                                     85                    Top 4%       0.8%
17    European Respiratory Journal                               54                    Top 2%       0.8%
18    PLOS Digital Health                                        91                    Top 3%       0.8%
19    Nature Biomedical Engineering                              42                    Top 2%       0.7%
20    Advanced Science                                          249                    Top 21%      0.7%
21    Science Advances                                         1098                    Top 32%      0.7%
22    JAMIA Open                                                 37                    Top 2%       0.5%