Improved Performance of ChatGPT-4 on the OKAP Exam: A Comparative Study with ChatGPT-3.5
Teebagy, S.; Colwell, L.; Wood, E.; Yaghy, A.; Faustina, M.
This study evaluates the performance of ChatGPT-4, an advanced artificial intelligence (AI) language model, on the Ophthalmology Knowledge Assessment Program (OKAP) examination and compares it with that of its predecessor, ChatGPT-3.5. Both models were tested on the same 180 OKAP practice questions covering various ophthalmology subject categories. ChatGPT-4 significantly outperformed ChatGPT-3.5 (81% vs. 57%; p<0.001), indicating improved performance on medical knowledge assessment. This superior performance suggests potential applicability in ophthalmologic education and clinical decision support systems. Future research should focus on refining AI models, ensuring a balanced representation of fundamental and specialized knowledge, and determining the optimal method of integrating AI into medical education and practice.
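As a rough plausibility check on the reported comparison, the sketch below runs a pooled two-proportion z-test in Python. Several things here are assumptions and not taken from the abstract: the abstract does not say which statistical test the authors used, and it gives only rounded percentages, so the counts 146 and 103 are reconstructed by rounding 81% and 57% of 180. The function name `two_proportion_z_test` is hypothetical. Because both models answered the same questions, a paired test such as McNemar's would arguably fit the design better, but it requires per-question results that the abstract does not provide.

```python
import math


def two_proportion_z_test(correct_a: int, correct_b: int, n: int) -> tuple[float, float]:
    """Pooled two-proportion z-test for two groups of equal size n.

    Returns (z statistic, two-sided p-value). This treats the two sets of
    answers as independent samples, which is a simplifying assumption.
    """
    p_a = correct_a / n
    p_b = correct_b / n
    pooled = (correct_a + correct_b) / (2 * n)
    standard_error = math.sqrt(pooled * (1 - pooled) * (2 / n))
    z = (p_a - p_b) / standard_error
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# ASSUMPTION: counts reconstructed from the rounded percentages in the abstract.
n_questions = 180
gpt4_correct = round(0.81 * n_questions)   # 146
gpt35_correct = round(0.57 * n_questions)  # 103

z, p = two_proportion_z_test(gpt4_correct, gpt35_correct, n_questions)
print(f"z = {z:.2f}, two-sided p = {p:.2e}")
```

Under these assumptions the p-value falls well below 0.001, which is consistent with the significance level reported in the abstract, though it does not reproduce the authors' actual analysis.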
Matching journals
The top 4 journals account for 50% of the predicted probability mass.