Patient-Centred Communication in Lung Cancer Screening: A Clinically Focussed Evaluation of a Fine-Tuned Open-Source Model Against a Larger Frontier System

Khanna, S.; Chaudhary, R.; Narula, N.; Lee, R.

2026-04-11 oncology
10.64898/2026.04.10.26350595 medRxiv

Lung cancer screening saves lives, yet uptake remains suboptimal and inequitable. Personalised communication can improve attendance and reduce anxiety, but scaling such support is a workforce challenge. We fine-tuned Google's Gemma 2 9B using QLoRA on 5,086 synthetic screening conversations and compared it against Google's Gemini 2.5 Flash (a larger frontier model) and an unmodified baseline across 300 multi-turn conversations with 100 patient personas spanning ten clinical categories. Evaluation combined automated natural language processing metrics with independent language model judgement in two complementary modes: structured clinical rubric and simulated patient persona. The fine-tuned model achieved the highest simulated patient experience score (3.71/5 vs 3.65 for the frontier model), recorded zero boundary violations after clinician review of all flagged instances, and led on the four most safety-critical categories. A composite Patient Adaptation Index showed that the fine-tuned model led overall (0.37 vs 0.35 vs 0.35), with its clearest advantage on the two clinically specific components: empathy calibration to patient distress and selective smoking cessation signposting. These findings suggest that targeted fine-tuning of open-source models can yield clinical communication quality comparable to larger proprietary systems, with advantages in safety-critical scenarios and suitability for NHS data governance constraints. Human clinician review of these conversations is ongoing.

Matching journals

The top 3 journals together cover 50% of the predicted probability mass (their listed probabilities sum to 54.5%).

1. npj Digital Medicine: 97 papers in training set, Top 0.1%, 43.1%
2. Nature Communications: 4913 papers in training set, Top 26%, 6.9%
3. Scientific Reports: 3102 papers in training set, Top 26%, 4.5%
(50% of probability mass above)
4. JCO Clinical Cancer Informatics: 18 papers in training set, Top 0.2%, 3.9%
5. Proceedings of the National Academy of Sciences: 2130 papers in training set, Top 18%, 3.9%
6. PLOS ONE: 4510 papers in training set, Top 43%, 3.0%
7. Nature Medicine: 117 papers in training set, Top 2%, 1.8%
8. eLife: 5422 papers in training set, Top 39%, 1.8%
9. iScience: 1063 papers in training set, Top 17%, 1.6%
10. JAMA Network Open: 127 papers in training set, Top 2%, 1.6%
11. Frontiers in Digital Health: 20 papers in training set, Top 0.8%, 1.3%
12. PLOS Digital Health: 91 papers in training set, Top 2%, 1.3%
13. Journal of Medical Internet Research: 85 papers in training set, Top 3%, 1.3%
14. Computers in Biology and Medicine: 120 papers in training set, Top 3%, 1.3%
15. PLOS Computational Biology: 1633 papers in training set, Top 20%, 1.2%
16. Nature Human Behaviour: 85 papers in training set, Top 4%, 0.8%
17. JMIR Formative Research: 32 papers in training set, Top 1%, 0.8%
18. Database: 51 papers in training set, Top 0.9%, 0.8%
19. Artificial Intelligence in Medicine: 15 papers in training set, Top 0.8%, 0.7%
20. BMJ Open: 554 papers in training set, Top 13%, 0.7%
21. Journal of Translational Medicine: 46 papers in training set, Top 3%, 0.7%
22. Interface Focus: 14 papers in training set, Top 0.4%, 0.5%
23. Journal of Biomedical Informatics: 45 papers in training set, Top 2%, 0.5%
24. EClinicalMedicine: 21 papers in training set, Top 1%, 0.5%
25. Journal of the American Medical Informatics Association: 61 papers in training set, Top 2%, 0.5%
26. Biology Methods and Protocols: 53 papers in training set, Top 3%, 0.5%