Can ChatGPT give holistic and accurate patient-centred information to oncology patients? A mixed-methods evaluation with stakeholders
Sun, M.; Reiter, E.; Murchie, P.; Kiltie, A. E.; Ramsay, G.; Duncan, L.; Adam, R.
Objective: More people than ever before are living with cancer. Patient education is a core component of cancer care, and patients are increasingly using large language models (LLMs), such as ChatGPT, for advice. The objectives of this study were to evaluate the ability of ChatGPT to explain specialist cancer care records (multidisciplinary team (MDT) meeting reports) to patients, and to understand key stakeholders' views and opinions about the technology.
Methods: Six simulated MDT meeting reports were created by cancer clinicians. The MDT reports and 184 realistic patient-centred queries were input into the ChatGPT-4.0 web version. We conducted a mixed-methods study combining qualitative analysis with exploratory quantitative components to evaluate ChatGPT's responses. The study consisted of three stages: (1) clinician sense-checking, (2) clinical and non-clinical annotation, and (3) focus groups (including cancer patients, caregivers, computer scientists, and clinicians).
Results: ChatGPT was able to summarise complex oncology information into simpler language, to provide definitions of complex terms, and to answer questions about clinical care. However, clinician sense-checking identified problems with accuracy, language, and content. In clinician annotation, 92.6% of ChatGPT's responses were judged problematic. Across all evaluation methods, six recurring themes were identified: accuracy, language, trust, content, personalisation, and integration challenges. Patients and clinicians found the summaries and definitions useful; however, the responses were not tailored to the individual patient or to what the report might mean for them.
Conclusion: This study highlights current challenges in using LLMs to explain complex cancer diagnoses and treatment records, including inaccurate information, inappropriate language, limited personalisation, distrust of AI, and difficulties in integrating LLMs into clinical workflows. Understanding these limitations is crucial for clinicians, patients, computer scientists, and policy makers, and the issues should be addressed before LLMs are deployed in clinical settings.