Temporally Phenotyping GLP-1RA Case Reports with Large Language Models: A Textual Time Series Corpus and Risk Modeling

Kumar, S.; Weiss, J.

2026-04-06 endocrinology
10.64898/2026.04.05.26350197 medRxiv

Type 2 diabetes case reports describe complex clinical courses, but their timelines are often expressed in language that is difficult to reuse in longitudinal modeling. To address this gap, we developed a textual time-series corpus of 136 PubMed Open Access single-patient case reports involving glucagon-like peptide-1 receptor agonists (GLP-1RAs), with clinical events associated with their most probable reference times. We evaluated automated LLM timeline extraction against gold-standard timelines annotated by clinical domain experts, assessing how well systems recovered clinical events and their timings. The best-performing LLM achieved high event coverage (GPT5; 0.871) and reliable temporal sequencing across symptoms (GPT5; 0.843), diagnoses, treatments, laboratory tests, and outcomes. As a downstream demonstration, time-to-event analyses in diabetes suggested lower risk of respiratory sequelae among GLP-1RA users versus non-users (HR=0.259, p<0.05), consistent with prior reports of improved respiratory outcomes. Temporal annotations and code will be released upon acceptance.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Journal of the American Medical Informatics Association, 61 papers in training set, Top 0.1%, 22.3%
2. npj Digital Medicine, 97 papers in training set, Top 0.2%, 18.5%
3. eLife, 5422 papers in training set, Top 9%, 8.3%
4. Molecular Systems Biology, 142 papers in training set, Top 0.1%, 4.8%

50% of probability mass above

5. Nature Communications, 4913 papers in training set, Top 33%, 4.8%
6. Nature Medicine, 117 papers in training set, Top 0.7%, 3.9%
7. Genome Medicine, 154 papers in training set, Top 3%, 2.6%
8. Advanced Science, 249 papers in training set, Top 9%, 2.1%
9. Scientific Reports, 3102 papers in training set, Top 51%, 2.1%
10. Cell Reports Medicine, 140 papers in training set, Top 3%, 1.9%
11. European Respiratory Journal, 54 papers in training set, Top 0.8%, 1.9%
12. JMIR Medical Informatics, 17 papers in training set, Top 0.7%, 1.8%
13. iScience, 1063 papers in training set, Top 13%, 1.8%
14. JAMIA Open, 37 papers in training set, Top 0.9%, 1.7%
15. Journal of Biomedical Informatics, 45 papers in training set, Top 0.9%, 1.5%
16. eBioMedicine, 130 papers in training set, Top 2%, 1.5%
17. PLOS Biology, 408 papers in training set, Top 16%, 0.9%
18. Communications Medicine, 85 papers in training set, Top 0.6%, 0.9%
19. Metabolites, 50 papers in training set, Top 0.9%, 0.9%
20. Pharmacoepidemiology and Drug Safety, 13 papers in training set, Top 0.4%, 0.7%
21. Science Advances, 1098 papers in training set, Top 30%, 0.7%
22. PLOS ONE, 4510 papers in training set, Top 68%, 0.7%
23. Frontiers in Pharmacology, 100 papers in training set, Top 5%, 0.6%
24. Nature Machine Intelligence, 61 papers in training set, Top 0.6%
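
The "top 4 journals account for 50%" statement can be checked with simple cumulative arithmetic over the predicted probabilities listed above. The following is a minimal sketch, not the site's actual method: the function name `journals_to_reach` is hypothetical, and it assumes the listed percentages are the per-journal predicted probabilities in rank order.

```python
from itertools import accumulate

# Predicted probabilities (percent) for the six top-ranked journals,
# copied from the list above in rank order.
probs = [22.3, 18.5, 8.3, 4.8, 4.8, 3.9]


def journals_to_reach(probs, threshold):
    """Return the smallest k whose top-k probabilities sum to at least
    `threshold`, or None if the threshold is never reached.

    Hypothetical helper for illustration only.
    """
    for k, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return k
    return None


k = journals_to_reach(probs, 50.0)
print(f"Journals needed to cover 50% of probability mass: {k}")
```

The running total first crosses 50% at the fourth journal, which matches where the "50% of probability mass above" divider sits in the list.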