Deep learning-based algorithm accurately classifies sleep stages in preadolescent children with sleep-disordered breathing symptoms and age-matched controls

Front Neurol. 2023 Apr 14:14:1162998. doi: 10.3389/fneur.2023.1162998. eCollection 2023.

Abstract

Introduction: Visual sleep scoring has several shortcomings, including inter-scorer inconsistency, which may adversely affect diagnostic decision-making. Although automatic sleep staging in adults has been extensively studied, it is uncertain whether such sophisticated algorithms generalize well to different pediatric age groups due to distinctive EEG characteristics. The preadolescent age group (10-13-year-olds) is relatively understudied, and thus, we aimed to develop an automatic deep learning-based sleep stage classifier specifically targeting this cohort.

Methods: A dataset (n = 115) containing polysomnographic recordings of Icelandic preadolescent children with sleep-disordered breathing (SDB) symptoms, and age and sex-matched controls was utilized. We developed a combined convolutional and long short-term memory neural network architecture relying on electroencephalography (F4-M1), electrooculography (E1-M2), and chin electromyography signals. Performance relative to human scoring was further evaluated by analyzing intra- and inter-rater agreements in a subset (n = 10) of data with repeat scoring from two manual scorers.

Results: The deep learning-based model achieved an overall cross-validated accuracy of 84.1% (Cohen's kappa κ = 0.78). There was no meaningful performance difference between SDB-symptomatic (n = 53) and control subgroups (n = 52) [83.9% (κ = 0.78) vs. 84.2% (κ = 0.78)]. The inter-rater reliability between manual scorers was 84.6% (κ = 0.78), and the automatic method reached similar agreements with scorers, 83.4% (κ = 0.76) and 82.7% (κ = 0.75).

Conclusion: The developed algorithm achieved high classification accuracy and substantial agreements with two manual scorers; the performance metrics compared favorably with typical inter-rater reliability between manual scorers and performance reported in previous studies. These suggest that our algorithm may facilitate less labor-intensive and reliable automatic sleep scoring in preadolescent children.

Keywords: Pediatric sleep staging; community controls; convolutional neural network; deep learning; inter-rater reliability; pediatric sleep-disordered breathing; preadolescent cohort; recurrent neural network.