
Artificial Intelligence in Neuro-Oncology: Assessing ChatGPT Accuracy in MRI Interpretation and Treatment Advice

Ishaque, A. H.; Boutet, A.; Hiremath, S. B.; Mullarkey, M. P.; Peris-Celda, M.; Zadeh, G.

2025-04-22 oncology
10.1101/2025.04.22.25326204 medRxiv

Purpose: Large language models (LLMs) have demonstrated advanced capabilities in interpreting text and visual inputs. Their potential to transform oncological practice is significant, but their accuracy and reliability in interpreting medical imaging and offering management suggestions remain underexplored. This study aimed to evaluate the performance of ChatGPT in interpreting T1-weighted contrast-enhanced MRI images of meningiomas and glioblastomas and providing treatment recommendations based on simulated patient inquiries.

Methods: This observational cohort study utilized publicly available MRI datasets. Thirty cases of meningiomas and glioblastomas were randomly selected, yielding 90 images (three orthogonal planes per case). ChatGPT-4o was tasked with interpreting these images and responding to six standardized patient-simulated questions. Two neuroradiologists and neurosurgeons assessed ChatGPT's performance using five-point Likert scales, and their inter-rater agreement was evaluated.

Results: ChatGPT identified MRI sequences with 91.7% accuracy and localized tumors correctly in 66.7% of cases. Tumor size was qualitatively described in 85% of cases, and the median acceptability was rated as 4.0 (IQR 4.0-5.0) by neuroradiologists. ChatGPT included meningioma in the differential diagnosis for 73.3% of meningioma cases and glioma in 83.3% of glioblastoma cases. Inter-rater agreement among neuroradiologists ranged from moderate to good (κ = 0.45-0.72). While surgical treatment was suggested in all symptomatic cases, neurosurgeon acceptability ratings varied, with poor inter-rater reliability.

Conclusions: ChatGPT demonstrates potential in interpreting neuro-oncological MRI images and offering preliminary management recommendations. However, errors in tumor localization and variability in recommendation acceptability underscore the need for physician oversight and further refinement of LLMs before clinical integration.

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Artificial Intelligence in Medicine | 15 papers in training set | Top 0.1% | 12.9%
2. JCO Clinical Cancer Informatics | 18 papers in training set | Top 0.1% | 12.6%
3. Scientific Reports | 3102 papers in training set | Top 22% | 4.9%
4. Biology Methods and Protocols | 53 papers in training set | Top 0.1% | 4.9%
5. npj Digital Medicine | 97 papers in training set | Top 0.9% | 4.9%
6. Neuro-Oncology Advances | 24 papers in training set | Top 0.1% | 4.2%
7. PLOS ONE | 4510 papers in training set | Top 36% | 3.9%
8. European Radiology | 14 papers in training set | Top 0.2% | 3.7%

50% of probability mass above

9. Frontiers in Oncology | 95 papers in training set | Top 1% | 3.7%
10. Computers in Biology and Medicine | 120 papers in training set | Top 1% | 3.1%
11. PLOS Computational Biology | 1633 papers in training set | Top 11% | 2.9%
12. JAMA Network Open | 127 papers in training set | Top 2% | 2.1%
13. npj Precision Oncology | 48 papers in training set | Top 0.5% | 1.7%
14. JCO Precision Oncology | 14 papers in training set | Top 0.2% | 1.4%
15. Cancers | 200 papers in training set | Top 4% | 1.2%
16. Diagnostics | 48 papers in training set | Top 1% | 1.2%
17. Radiotherapy and Oncology | 18 papers in training set | Top 0.2% | 1.2%
18. BMJ Open | 554 papers in training set | Top 11% | 1.2%
19. Clinical Cancer Research | 58 papers in training set | Top 1% | 1.2%
20. Journal of Translational Medicine | 46 papers in training set | Top 2% | 1.1%
21. JMIR Medical Informatics | 17 papers in training set | Top 1% | 1.0%
22. BMJ Health & Care Informatics | 13 papers in training set | Top 0.7% | 1.0%
23. Frontiers in Computational Neuroscience | 53 papers in training set | Top 2% | 0.9%
24. BMC Cancer | 52 papers in training set | Top 2% | 0.9%
25. Medical Physics | 14 papers in training set | Top 0.5% | 0.9%
26. PeerJ | 261 papers in training set | Top 12% | 0.9%
27. Annals of Biomedical Engineering | 34 papers in training set | Top 1% | 0.8%
28. BMC Bioinformatics | 383 papers in training set | Top 6% | 0.8%
29. European Journal of Cancer | 10 papers in training set | Top 0.4% | 0.8%
30. BMC Medical Informatics and Decision Making | 39 papers in training set | Top 2% | 0.8%
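
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked with a running sum over the ranked probabilities. The sketch below assumes the cutoff is the smallest rank at which the cumulative probability reaches 50%; the page does not state how it computes the cutoff, and the function name `journals_to_reach` is hypothetical, introduced only for this illustration.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-10, copied from the list above.
probabilities = [12.9, 12.6, 4.9, 4.9, 4.9, 4.2, 3.9, 3.7, 3.7, 3.1]


def journals_to_reach(mass_threshold, probs):
    """Return how many top-ranked entries are needed for the cumulative
    probability to reach mass_threshold (same units as probs).

    Hypothetical helper: assumes the cutoff is the first rank whose
    running total is at or above the threshold.
    """
    for count, running_total in enumerate(accumulate(probs), start=1):
        if running_total >= mass_threshold:
            return count
    return None  # threshold never reached with the given entries


cutoff = journals_to_reach(50.0, probabilities)
top_mass = sum(probabilities[:cutoff])
print(cutoff, round(top_mass, 1))
```

Under that assumption the running total after rank 7 is 48.3% and after rank 8 is 52.0%, which is consistent with the cutoff being drawn after the eighth journal.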