
Video-based Detection of Delirium in Hospitalized Adults

Mendu, M.; Tesh, R. A.; Pellerin, K.; Steward, G. E.; Cerda, I. H.; Williams, M.; Colman, M.; Shah, S.; Lam, A. D.; Cash, S. S.; Westover, M. B.; Kimchi, E. Y.

2026-05-13 geriatric medicine

10.64898/2026.05.11.26352902 medRxiv


Delirium, a dynamic neuropsychiatric condition associated with morbidity and mortality, remains underdiagnosed due to reliance on subjective, intermittent screening tools. Objective and potentially continuous identification is needed to improve clinical care. We developed and validated an analytic framework for delirium classification based on automatically extracted video features. In this prospective cohort study, patients (≥ 18 years) admitted to the inpatient medical or neurological ward of a tertiary academic center between August 2020 and March 2022 with an expected stay longer than one night were enrolled. Daily structured delirium assessments and brief video recordings were performed in consenting patients. Videos were analyzed using deep learning pose estimation to extract keypoints and calculate behavioral features based on eye, face, and limb postures and movements. Four machine learning models (logistic regression, gradient boosting, support vector machines, and random forests) were trained to predict delirium status from extracted features. Model performance was evaluated on 20 repetitions of three-fold cross-validation using the area under the receiver operating characteristic curve (AUC ROC). The cohort included 109 videos from 25 male and 25 female participants (median age: 72, IQR: 63.25-78). Twenty videos (18%) were from patients with delirium. Keypoints for this dataset were more accurately extracted using a customized ResNet-101 model developed with DeepLabCut (sensitivity 0.94, specificity 0.89, compared to human-labeled gold standards) than using off-the-shelf models. Keypoints were then used to generate behavioral features summarizing movement and postures throughout the video. A support vector machine model achieved an average delirium classification AUC ROC of 0.79 (SD ± 0.09), sensitivity of 0.71 (SD ± 0.16), and specificity of 0.78 (SD ± 0.07).
This study demonstrates the feasibility of identifying delirium using brief videos in clinically heterogeneous cohorts and reveals novel features for objective identification.

Author Summary

Delirium is a sudden change in attention and awareness that commonly affects hospitalized patients. It is linked with longer hospital stays, cognitive decline, and death. Patients with delirium often show changes in movements and behaviors such as slowed movement, restlessness, or excessive scanning of the environment. Since current screening tools rely on intermittent human interactions, they can be subjective and miss the fluctuating nature of delirium, leading to underdiagnosis. We sought to explore whether short video recordings could be used to detect delirium automatically. In our study, we enrolled 50 hospitalized patients and conducted daily delirium assessments and video recordings. We used a machine learning model to analyze patients' eye movements, facial expressions, and body postures. We found that video-derived features could be used to identify delirium in a small clinical cohort. While needing further validation in outside cohorts, this study shows an important proof-of-concept for objective delirium monitoring in heterogeneous clinical contexts without adding burden to clinical staff.
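The evaluation scheme named in the abstract (20 repetitions of three-fold cross-validation scored by AUC ROC) can be illustrated with a minimal, standard-library Python sketch. Everything here is an assumption made for illustration: the data are synthetic (109 samples, 20 positive, matching only the counts in the abstract), the four features are random numbers, and a nearest-centroid scorer stands in for the study's support vector machine. The abstract does not say how repeated videos from the same patient were assigned to folds, so this sketch stratifies by label only.

```python
import random
from statistics import mean, stdev


def auc_roc(labels, scores):
    """AUC ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Yield (train_idx, test_idx) pairs with each class spread evenly over k folds."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, l in enumerate(labels) if l == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    for f in range(k):
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, folds[f]


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid_scores(X, y, train, test):
    """Placeholder classifier (NOT the study's SVM): higher score means
    closer to the positive-class centroid than to the negative-class one."""
    c0 = centroid([X[i] for i in train if y[i] == 0])
    c1 = centroid([X[i] for i in train if y[i] == 1])
    return [sq_dist(X[i], c0) - sq_dist(X[i], c1) for i in test]


def repeated_cv_auc(X, y, k=3, repeats=20, seed=0):
    """Return one AUC ROC per held-out fold across all repetitions."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(repeats):
        for train, test in stratified_folds(y, k, rng):
            scores = centroid_scores(X, y, train, test)
            aucs.append(auc_roc([y[i] for i in test], scores))
    return aucs


# Synthetic stand-in data: 20 positive and 89 negative samples, 4 features.
data_rng = random.Random(1)
y = [1] * 20 + [0] * 89
X = [[data_rng.gauss(1.0 if label else 0.0, 1.0) for _ in range(4)] for label in y]

aucs = repeated_cv_auc(X, y)
print(f"AUC ROC: {mean(aucs):.2f} (SD {stdev(aucs):.2f}) over {len(aucs)} folds")
```

Reporting the mean and standard deviation over all 60 held-out folds mirrors the "average (SD ±)" form used in the abstract. In practice a library such as scikit-learn would replace the hand-written splitter and scorer, and grouping folds by patient would be worth considering when one patient contributes several videos.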
