LLM-Based Classification of Case Report Abstracts: A Pilot Study on Interactions between Radiotherapy and Systemic Therapies

Dennstaedt, F.; Bobnar, T.; Handra, A.; Putora, P. M.; Filchenko, I.; Brueningk, S.; Aebersold, D. M.; Cihoric, N.; Shelan, M.

2025-12-29 health informatics
10.64898/2025.12.22.25342797 medRxiv

Background: The growing volume of biomedical literature, especially in oncology, necessitates automated tools for extracting clinically relevant information. Large Language Models (LLMs) offer promising capabilities for data extraction in this domain. However, their potential to extract clinically relevant information from case reports detailing rare treatment interactions remains underexplored.
Methods: We systematically searched PubMed for case reports on interactions between radiotherapy (RT) and Pembrolizumab, Cetuximab, or Cisplatin. A random sample of 100 report abstracts for each therapy was manually classified by two independent medical experts using 17 Boolean questions with mutually exclusive answers, covering patient demographics, treatment, cancer type, and outcome; these classifications formed the ground truth. An LLM-based system using the open-source GPT models (GPT-OSS-120B and GPT-OSS-20B) was applied to classify these reports and the remaining dataset entries using the defined question structure. Performance of the LLM-based information extraction was evaluated using standard classification metrics: accuracy, precision, recall, and F1-score.
Results: The systematic searches yielded 320 (Pembrolizumab), 147 (Cetuximab), and 2055 (Cisplatin) publications. Inter-rater agreement for manual classification was high (Cohen's kappa = 0.87), though lower (0.60-0.80) for specific outcome and cancer type questions. The LLM-based classification (GPT-OSS-120B model) achieved high overall performance with an F1-score of 94.33% (95.83% accuracy, 93.69% precision, 94.98% recall). Performance was consistent across systemic therapies, and the smaller GPT-OSS-20B model showed similar results (F1-score 94.06%). Analysis of the complete datasets revealed that 56.02% of publications described patients who received both RT and systemic therapy. Proportions of positive and negative outcomes varied by therapy and sequencing.
Conclusions: LLM-based classification systems demonstrate high accuracy and reliability for curating scientific case reports on RT and systemic therapy interactions. These findings support their potential for high-throughput hypothesis generation and knowledge base construction in oncology, particularly for underutilized case reports, with even smaller open-source models proving effective for such tasks.
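
The abstract reports accuracy, precision, recall, F1-score, and Cohen's kappa for Boolean answers. As a minimal sketch of how such figures are typically computed (not the authors' code), the Python below scores one hypothetical Boolean question. The function names and the toy label lists are illustrative assumptions, and how the study aggregated scores across its 17 questions is not stated in the abstract.

```python
def classification_metrics(truth, predicted):
    """Accuracy, precision, recall, and F1 for paired Boolean labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(t and p for t, p in pairs)          # true positives
    tn = sum(not t and not p for t, p in pairs)  # true negatives
    fp = sum(not t and p for t, p in pairs)      # false positives
    fn = sum(t and not p for t, p in pairs)      # false negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = harmonic_f1(precision, recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def harmonic_f1(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on Boolean labels."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_a_true = sum(rater_a) / n
    p_b_true = sum(rater_b) / n
    # Agreement expected if both raters answered independently at random
    expected = p_a_true * p_b_true + (1 - p_a_true) * (1 - p_b_true)
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical answers to one Boolean question for eight abstracts
truth = [True, True, True, False, False, False, True, False]
predicted = [True, True, False, False, False, True, True, False]

metrics = classification_metrics(truth, predicted)
kappa = cohen_kappa(truth, predicted)
```

On this toy data every metric is 0.75 and kappa is 0.5. As a consistency check, the harmonic mean of the reported 93.69% precision and 94.98% recall rounds to the reported F1-score of 94.33%.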

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. JCO Clinical Cancer Informatics; 18 papers in training set; Top 0.1%; 23.4%
2. BMC Medical Informatics and Decision Making; 39 papers in training set; Top 0.5%; 5.0%
3. Bioinformatics; 1061 papers in training set; Top 4%; 5.0%
4. Journal of the American Medical Informatics Association; 61 papers in training set; Top 0.5%; 5.0%
5. Artificial Intelligence in Medicine; 15 papers in training set; Top 0.1%; 5.0%
6. Cancer Medicine; 24 papers in training set; Top 0.2%; 4.3%
7. BMC Bioinformatics; 383 papers in training set; Top 3%; 3.7%
(50% of probability mass above)
8. JMIR Medical Informatics; 17 papers in training set; Top 0.3%; 3.7%
9. International Journal of Medical Informatics; 25 papers in training set; Top 0.4%; 3.7%
10. Scientific Reports; 3102 papers in training set; Top 33%; 3.7%
11. JAMIA Open; 37 papers in training set; Top 0.6%; 2.2%
12. Journal of Biomedical Informatics; 45 papers in training set; Top 0.6%; 2.2%
13. npj Digital Medicine; 97 papers in training set; Top 2%; 2.2%
14. Journal of Medical Internet Research; 85 papers in training set; Top 2%; 1.8%
15. PLOS ONE; 4510 papers in training set; Top 52%; 1.8%
16. Computers in Biology and Medicine; 120 papers in training set; Top 2%; 1.8%
17. Journal of Clinical Epidemiology; 28 papers in training set; Top 0.3%; 1.7%
18. BMJ Health & Care Informatics; 13 papers in training set; Top 0.6%; 1.3%
19. BMC Medical Research Methodology; 43 papers in training set; Top 0.8%; 1.3%
20. Scientific Data; 174 papers in training set; Top 2%; 1.1%
21. Informatics in Medicine Unlocked; 21 papers in training set; Top 0.8%; 1.0%
22. Computer Methods and Programs in Biomedicine; 27 papers in training set; Top 0.7%; 0.9%
23. BMJ Open; 554 papers in training set; Top 12%; 0.8%
24. The Lancet Digital Health; 25 papers in training set; Top 0.9%; 0.8%
25. Biology Methods and Protocols; 53 papers in training set; Top 2%; 0.8%
26. iScience; 1063 papers in training set; Top 30%; 0.8%
27. Database; 51 papers in training set; Top 1.0%; 0.7%
28. European Journal of Cancer; 10 papers in training set; Top 0.5%; 0.7%
29. Frontiers in Artificial Intelligence; 18 papers in training set; Top 0.9%; 0.7%
30. Frontiers in Digital Health; 20 papers in training set; Top 2%; 0.7%
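
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked from the listed probabilities. The sketch below is an illustration only: the function name is hypothetical, and the list holds just the first ten probabilities copied from the ranking, in percent.

```python
from itertools import accumulate

# Predicted probabilities (percent) of the ten highest-ranked journals
top_probabilities = [23.4, 5.0, 5.0, 5.0, 5.0, 4.3, 3.7, 3.7, 3.7, 3.7]


def journals_to_reach(probabilities, threshold):
    """Return the smallest rank whose cumulative probability meets the threshold."""
    for rank, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return rank
    return None  # threshold not reached within the given list


cutoff_rank = journals_to_reach(top_probabilities, 50.0)
```

The running total is about 47.7% after six journals and about 51.4% after seven, so the cutoff falls at rank 7, in agreement with the listing.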