Prioritising Hospital Complaints: An Innovative Tool Using Large Language Model-Assisted Content Analysis and Machine Learning Algorithms

Sulaiman, M. H.; Muda, N.; Abdul Razak, F.

2025-06-08 health systems and quality improvement

10.1101/2025.06.07.25329193 medRxiv


Background: In clinical settings, patients often express dissatisfaction through narrative speech or written text. However, most complaints management systems still rely on manual review or rule-based methods that fail to capture the severity or urgency of complaints. This leads to inconsistent triage, delayed resolution and missed opportunities for systemic improvement. A novel model leveraging large language model-assisted content analysis (LACA) and machine learning (ML) can transform subjective narratives into standardized, machine-readable severity scores, facilitating the prioritisation of complaints.

Objective: This study aims to (1) determine the precision, recall, F1-score and accuracy of the proposed predictive models used to classify comments as low-alert or high-alert, (2) determine the construct validity and internal consistency (Cronbach's α) of the themes found by LACA conducted on hospital web-based review data, (3) determine the predictors of low-alert and high-alert comments and their ability to change the log-odds of the outcome in logistic regression (LR), and (4) measure the robustness of the explanatory model using pseudo-R².

Methodology: LACA was performed using a set of thematic codes to generate an independent variable dataset (x), scored on a scale of 0: not an issue, 1: a small issue, 2: a moderate issue, 3: a serious issue, and 4: an extremely serious issue. The independent variables (x) and the dependent variable (y, representing the review rating) were then split into training and testing sets to build predictive ML models. Grid search was used to determine the optimal combination of hyperparameters. The performance of the predictive and explanatory models was then evaluated.
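The methodology above (theme severity scores as features, a train/test split, and a grid search over hyperparameters) can be illustrated with a minimal, standard-library-only sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the data are synthetic, the rule that turns scores into a high-alert label is hypothetical, the scores are centred on the scale midpoint, the grid values are arbitrary, and a single validation hold-out stands in for whatever validation scheme the authors used.

```python
import math
import random
from itertools import product


def predict_proba(w, b, x):
    """Logistic model: probability that comment x is high-alert."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    # Numerically stable sigmoid
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logreg(X, y, lr, l2, epochs=200):
    """Batch gradient descent for L2-penalised logistic regression."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(model, X, y):
    w, b = model
    hits = sum(
        (predict_proba(w, b, xi) >= 0.5) == (yi == 1) for xi, yi in zip(X, y)
    )
    return hits / len(y)


# Synthetic stand-in for LACA output: 6 theme scores per comment on the
# 0..4 severity scale, centred on the midpoint (2) for stable training.
rng = random.Random(0)
X = [[rng.randint(0, 4) - 2 for _ in range(6)] for _ in range(300)]
# Hypothetical labelling rule (NOT the study's): high-alert (1) when the
# raw total severity is at least 12, i.e. the centred total is >= 0.
y = [1 if sum(row) >= 0 else 0 for row in X]

# 60/20/20 split into training, validation and test sets
X_train, y_train = X[:180], y[:180]
X_val, y_val = X[180:240], y[180:240]
X_test, y_test = X[240:], y[240:]

# Grid search: try every hyperparameter combination, keep the one with
# the best validation accuracy (illustrative grid values).
grid = {"lr": [0.01, 0.1], "l2": [0.0, 0.01]}
best = max(
    (dict(zip(grid, combo)) for combo in product(*grid.values())),
    key=lambda p: accuracy(fit_logreg(X_train, y_train, **p), X_val, y_val),
)

# Refit on training + validation data, then score once on the test set
model = fit_logreg(X_train + X_val, y_train + y_val, **best)
test_accuracy = accuracy(model, X_test, y_test)
```

The test set is touched only once, after the hyperparameters are fixed, so the reported accuracy is not biased by the search itself.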
Results: ML classification produced F1-scores of 0.88 - 0.94 and an accuracy of 0.92 for the LR model, and F1-scores of 0.87 - 0.94 and an accuracy of 0.92 for the artificial neural network (ANN) model. The behaviour of the predictive models was successfully explained by the explanatory model: six themes were identified, with a cumulative explained variance (CEV) of 0.74 and an average Cronbach's α of 0.86. LR showed statistical significance for 5 themes, with a pseudo-R² of 0.55.

Conclusion: This study demonstrates that a data pipeline utilizing LACA and ML algorithms shows excellent performance in classifying patient comments in a hospital setting. All effectiveness parameters, including CEV, Cronbach's α, precision, recall, F1-score and accuracy, indicate strong performance in differentiating high-alert from low-alert comments.
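For readers less familiar with the reported quantities, the following sketch shows how Cronbach's α, precision/recall/F1 and a pseudo-R² can be computed from raw values. It is a hedged illustration, not the authors' code: the function names are hypothetical, the toy inputs are invented, and McFadden's definition of pseudo-R² is assumed because the abstract does not say which variant was used.

```python
import math
from statistics import mean, pvariance


def cronbach_alpha(items):
    """items: one list of scores per item (theme), all of equal length."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent totals
    item_variance = sum(pvariance(col) for col in items)
    return k / (k - 1) * (1 - item_variance / pvariance(totals))


def precision_recall_f1(y_true, y_pred, positive=1):
    """Metrics for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall, 2 * precision * recall / (precision + recall)


def mcfadden_r2(y_true, p_model):
    """1 - LL(model) / LL(intercept-only model); McFadden's variant (assumed)."""
    def log_lik(probs):
        return sum(math.log(p if t == 1 else 1 - p) for t, p in zip(y_true, probs))
    base_rate = mean(y_true)
    return 1 - log_lik(p_model) / log_lik([base_rate] * len(y_true))


# Toy inputs, invented for illustration only
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 3, 4, 5]])
precision, recall, f1 = precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
r2 = mcfadden_r2([1, 0, 1, 0], [0.9, 0.1, 0.9, 0.1])
```

In the first toy call the two items move in lockstep, so α reaches its maximum of 1; real theme scores would give a lower value.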
