
Multidimensional Evaluation of Large Language Models on the AAP In-Service Examination: Assessing Accuracy, Calibration, and Citation Reliability

2025-10-17 dentistry and oral medicine Title + abstract only
View on medRxiv

Background: Large language models (LLMs) have demonstrated rapid advancements in natural language understanding and generation, prompting their integration into biomedical research, clinical practice, and professional education. However, systematic evaluation of LLMs in specialty-specific domains such as dentistry and periodontology remains limited, particularly regarding multidimensional performance metrics. Objective: To conduct a comprehensive, multidimensional assessment of commercially availabl...
