JADE: Jawbone Lesion Diagnosis and Decision Supporting System

Baseri Saadi, S.; Ver Berne, J.; Cavalcante Fontenele, R.; Claes, P.; Jacobs, R.

2026-02-01 pathology
10.64898/2026.01.26.26344704 medRxiv
Objectives: To develop and evaluate JADE, a proof-of-concept retrieval-augmented generation (RAG) diagnostic assistive system designed to enhance large language model (LLM) reasoning for the assessment of jawbone lesions. This study examined whether integrating structured retrieval with GPT-5 improves diagnostic accuracy and stability compared with standalone LLMs.

Methods: JADE was developed as a cloud-based application integrating GPT-5 with a curated oral radiology and pathology database using a hybrid semantic-keyword retrieval strategy. Clinical and radiographic characteristics were imported as a structured query to guide retrieval and support diagnostic reasoning. Performance was compared with standalone GPT-5, Claude Sonnet 4.5, DeepSeek-R1, and Gemini 2.5 Flash across 25 cases. Accuracy was analysed using Cochran's Q test with post-hoc McNemar's tests and Bonferroni correction. Intra-model stability was measured using the majority agreement ratio (MAR), and response time was recorded to assess real-time usability.

Results: JADE showed the highest diagnostic performance, correctly identifying 20 out of 25 cases and outperforming all standalone LLMs. Significant differences were observed across models (Cochran's Q = 33.2, df = 4, p < 0.001), with post-hoc analyses confirming that JADE significantly outperformed GPT-5, Gemini 2.5 Flash, and Claude Sonnet 4.5 (p < 0.01). JADE also exhibited the greatest run-to-run stability (mean MAR = 0.90 ± 0.18). The average prediction time of 6 ± 0.5 seconds supported its feasibility for real-time clinical use.

Conclusions: JADE improved diagnostic accuracy and stability over standalone LLMs, underscoring the value of RAG reasoning in jawbone lesion assessment and its potential for real-time clinical use.
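The statistics named in the abstract can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names are hypothetical, the toy table is synthetic (it is not the study data), and the definition of the majority agreement ratio used here (share of repeated runs that match the most frequent answer) is an assumption, since the abstract does not define it. The chi-square tail is computed with the closed form that holds for even degrees of freedom, which covers the reported df = 4.

```python
from collections import Counter
from math import comb, exp, factorial


def cochran_q(table):
    """Cochran's Q for a cases x models table of 0/1 (wrong/correct) outcomes."""
    k = len(table[0])
    col_totals = [sum(row[j] for row in table) for j in range(k)]
    row_totals = [sum(row) for row in table]
    n = sum(row_totals)
    denom = k * n - sum(r * r for r in row_totals)
    if denom == 0:
        raise ValueError("Q is undefined when every case has identical outcomes")
    q = (k - 1) * (k * sum(c * c for c in col_totals) - n * n) / denom
    return q, k - 1  # statistic and degrees of freedom


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2
    return exp(-half) * sum(half ** i / factorial(i) for i in range(df // 2))


def mcnemar_exact(a, b):
    """Two-sided exact McNemar p-value from the discordant pairs of two 0/1 lists."""
    n01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    n10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    n = n01 + n10
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(min(n01, n10) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def bonferroni(p, n_comparisons):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_comparisons)


def majority_agreement_ratio(answers):
    """ASSUMED definition: fraction of runs matching the most common answer."""
    return Counter(answers).most_common(1)[0][1] / len(answers)


# Synthetic toy data: 6 cases (rows) x 3 models (columns), 1 = correct.
toy = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 0],
]
q, df = cochran_q(toy)
p_first_vs_third = mcnemar_exact([r[0] for r in toy], [r[2] for r in toy])
p_adjusted = bonferroni(p_first_vs_third, n_comparisons=3)

# Consistency check of the reported omnibus result (Q = 33.2, df = 4).
p_reported = chi2_sf_even_df(33.2, 4)
```

With the reported Q = 33.2 and df = 4, the tail probability is on the order of 1e-6, which agrees with the stated p < 0.001.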

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | PLOS ONE | 4510 | Top 7% | 19.9%
2 | Journal of Pathology Informatics | 13 | Top 0.1% | 15.7%
3 | Cureus | 67 | Top 0.3% | 9.0%
4 | npj Digital Medicine | 97 | Top 0.8% | 5.2%
5 | Scientific Reports | 3102 | Top 21% | 5.2%
(50% of probability mass above)
6 | Journal of Medical Internet Research | 85 | Top 1.0% | 4.6%
7 | Computers in Biology and Medicine | 120 | Top 0.6% | 4.2%
8 | European Radiology | 14 | Top 0.3% | 2.2%
9 | GigaScience | 172 | Top 1% | 1.9%
10 | BMC Medical Informatics and Decision Making | 39 | Top 1% | 1.8%
11 | Computational and Structural Biotechnology Journal | 216 | Top 5% | 1.4%
12 | Diagnostics | 48 | Top 1% | 1.4%
13 | Modern Pathology | 21 | Top 0.3% | 1.2%
14 | Journal of Visualized Experiments | 30 | Top 0.5% | 1.0%
15 | IEEE Journal of Biomedical and Health Informatics | 34 | Top 2% | 1.0%
16 | PLOS Computational Biology | 1633 | Top 23% | 0.8%
17 | Frontiers in Oncology | 95 | Top 3% | 0.8%
18 | Journal of Clinical Pathology | 12 | Top 0.4% | 0.8%
19 | Alzheimer's & Dementia: Diagnosis, Assessment & Disease Monitoring | 38 | Top 1.0% | 0.8%
20 | JMIR Medical Informatics | 17 | Top 1% | 0.8%
21 | iScience | 1063 | Top 29% | 0.8%
22 | Biology Methods and Protocols | 53 | Top 2% | 0.8%
23 | The Lancet Digital Health | 25 | Top 1% | 0.7%
24 | Frontiers in Medicine | 113 | Top 7% | 0.7%
25 | European Journal of Nuclear Medicine and Molecular Imaging | 19 | Top 0.4% | 0.5%
26 | Bioengineering | 24 | Top 2% | 0.5%
27 | JAMA Network Open | 127 | Top 5% | 0.5%
28 | PLOS Digital Health | 91 | Top 3% | 0.5%