Temporally Continuous Automated Sleep-Wake Classification Using Deep Learning

Somaskandhan, P.; Korkalainen, H.; Leppänen, T.; Töyräs, J.; Melehan, K.; Ruehland, W.; Sands, S. A.; Mann, D. L.; Wilson, D. L.; Terrill, P. I.

2025-12-04 health informatics

10.64898/2025.12.03.25341129 medRxiv

Introduction: Segmenting sleep into fixed 30-second epochs remains central to current sleep scoring practice, yet it imposes rigid boundaries that may not accurately reflect the true temporal sleep dynamics. We aimed to develop a deep learning-based, high-temporal-resolution sleep-wake classifier leveraging temporally continuous manual reference scoring without fixed epoch boundaries and transfer learning techniques to facilitate progress toward a more physiologically consistent sleep assessment.

Methods: Three independent datasets were utilized, of which two included sleep-wake scoring manually conducted in a temporally continuous manner. A U-Net based model was initially trained on a large dataset scored using 30-second epochs, with post hoc scoring modifications (n=2034). It was then fine-tuned via transfer learning using a subset of one of the datasets with temporally continuous scoring (n=39) and validated on both its holdout portion (n=40) and the other independent temporally continuous scoring dataset (n=20). Wakefulness and arousals were consolidated, acknowledging their shared physiological characteristics. Prediction confidence estimates were also generated.

Results: The model achieved overall concordance of 88.96% (κ=0.78) and 88.23% (κ=0.76) in the holdout and second independent evaluation dataset, respectively, with temporally continuous scoring. Correlation between 1-second automatic predictions and temporally continuous manual scoring was r=0.93 (p<0.001) for total sleep time and r=0.67 (p<0.001) for sleep-to-wake transition index.

Conclusions: These findings support the utility of our model in addressing key limitations of 30-second epoch-based scoring and progressing toward more physiologically consistent sleep-wake assessment by providing a practical basis for subsequent analyses. Misclassifications generally showed lower confidences, indicating additional value for targeted review.

Statement of Significance: Conventional sleep scoring remains constrained by fixed 30-second epochs, which may fail to capture the true temporal dynamics of the underlying changes between sleep and wakefulness. In this study, we used polysomnography data manually scored on a temporally continuous basis as the gold standard to develop and validate a deep learning model capable of classifying sleep and wakefulness-like states (consolidating wakefulness and arousal) at high temporal resolution without fixed 30-second epochs. The model demonstrated strong agreement with the gold standard, and as such, lays a practical foundation for deriving improved physiologically meaningful biomarkers of sleep fragmentation and continuity, with potential diagnostic and prognostic value and broad applicability toward a more precise and physiologically consistent sleep assessment.
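
The agreement metrics reported in the abstract (overall concordance, Cohen's κ, and a sleep-to-wake transition index) can be illustrated with a minimal Python sketch on 1-second labels. This is not the authors' code. The function names, the label encoding (0 = sleep, 1 = wakefulness-like), and the definition of the transition index as sleep-to-wake transitions per hour of sleep are assumptions made here for illustration; the abstract does not state the exact definitions used.

```python
def concordance_and_kappa(reference, predicted):
    """Percent agreement and Cohen's kappa for two binary label sequences.

    Assumed encoding (not from the paper): 0 = sleep, 1 = wakefulness-like.
    """
    if len(reference) != len(predicted) or len(reference) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(reference)
    # Observed agreement: fraction of seconds with identical labels.
    observed = sum(r == p for r, p in zip(reference, predicted)) / n
    # Chance agreement from each scorer's marginal wake proportion.
    ref_wake = sum(reference) / n
    pred_wake = sum(predicted) / n
    expected = ref_wake * pred_wake + (1 - ref_wake) * (1 - pred_wake)
    # Kappa is undefined when chance agreement is 1; return 1.0 by convention here.
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 1.0
    return 100.0 * observed, kappa


def sleep_to_wake_transition_index(labels, seconds_per_label=1.0):
    """Sleep-to-wake transitions per hour of sleep (assumed definition)."""
    transitions = sum(
        1 for prev, cur in zip(labels, labels[1:]) if prev == 0 and cur == 1
    )
    sleep_hours = sum(1 for x in labels if x == 0) * seconds_per_label / 3600.0
    if sleep_hours == 0:
        raise ValueError("no sleep in the recording")
    return transitions / sleep_hours


if __name__ == "__main__":
    # Toy 10-second example, not real data.
    reference = [0, 0, 0, 1, 1, 0, 0, 1, 0, 0]
    predicted = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
    agreement, kappa = concordance_and_kappa(reference, predicted)
    print(f"concordance = {agreement:.1f}%, kappa = {kappa:.3f}")
    print(f"transition index = {sleep_to_wake_transition_index(reference):.1f} per hour")
```

On real recordings, the two sequences would be the per-second manual scoring and the per-second model output over the same time span.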
