Objective: To determine whether machine learning techniques would enhance our ability to incorporate key variables into a parsimonious model with optimized prediction performance for electroencephalographic seizure (ES) prediction in critically ill children.
Methods: We analyzed data from a prospective observational cohort study of 719 consecutive critically ill children with encephalopathy who underwent clinically indicated continuous EEG monitoring (CEEG). We implemented and compared three state-of-the-art machine learning methods for ES prediction: (1) random forest; (2) Least Absolute Shrinkage and Selection Operator (LASSO); and (3) Deep Learning Important FeaTures (DeepLIFT). We developed a ranking algorithm based on the relative importance of each variable as derived from each of the three machine learning methods.
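The abstract does not state how the per-method importances are combined into a single ranking. As a minimal sketch only, assuming a simple mean-rank aggregation (an assumption, not necessarily the authors' algorithm), the idea can be illustrated as follows. The function names `rank_within_method` and `aggregate_ranking`, the variable names, and all numeric values are hypothetical and are not study data.

```python
from typing import Dict, List


def rank_within_method(importance: Dict[str, float]) -> Dict[str, int]:
    """Rank variables by descending absolute importance (1 = most important).

    Absolute values are used so that signed scores, such as LASSO
    coefficients, are comparable with non-negative importances.
    Ties are broken alphabetically to keep the result deterministic.
    """
    ordered = sorted(importance, key=lambda v: (-abs(importance[v]), v))
    return {v: i + 1 for i, v in enumerate(ordered)}


def aggregate_ranking(per_method: Dict[str, Dict[str, float]]) -> List[str]:
    """Order variables by their mean rank across methods.

    ASSUMPTION: mean-rank aggregation is an illustrative choice; the
    source does not specify how the three rankings are combined.
    Only variables scored by every method are ranked.
    """
    variables = sorted(set.intersection(*(set(m) for m in per_method.values())))
    ranks = {
        name: rank_within_method({v: imp[v] for v in variables})
        for name, imp in per_method.items()
    }
    mean_rank = {
        v: sum(r[v] for r in ranks.values()) / len(ranks) for v in variables
    }
    return sorted(variables, key=lambda v: (mean_rank[v], v))


if __name__ == "__main__":
    # Hypothetical importance scores, for illustration only.
    scores = {
        "random_forest": {"var_a": 0.5, "var_b": 0.3, "var_c": 0.2},
        "lasso": {"var_a": -0.9, "var_b": 0.0, "var_c": 0.4},
        "deeplift": {"var_a": 0.1, "var_b": 0.6, "var_c": 0.3},
    }
    print(aggregate_ranking(scores))
```

Ranking within each method before combining avoids having to put random forest importances, LASSO coefficients, and DeepLIFT attributions on a common numeric scale.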
Results: Based on our ranking algorithm, the top five variables for ES prediction were: (1) epileptiform discharges in the initial 30 minutes, (2) clinical seizures prior to CEEG initiation, (3) sex, (4) age dichotomized at 1 year, and (5) epileptic encephalopathy. Compared to the stepwise selection-based approach in logistic regression, the top variables selected by our ranking algorithm were more informative: models using them achieved better prediction performance as measured by prediction accuracy, area under the receiver operating characteristic curve (AUROC), and F1 score. Including further variables did not improve, and sometimes worsened, model performance.
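For readers unfamiliar with the three performance measures named above, the following is a minimal pure-Python sketch of their standard definitions for binary outcomes (1 = ES present, 0 = ES absent). It is illustrative only and is not the authors' evaluation code; the function names are hypothetical, and the 0.5 classification threshold mentioned in the note below is an assumption.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative.

    Computed by direct pairwise comparison (ties count one half), which
    is equivalent to the area under the ROC curve.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Accuracy and F1 score are computed from hard class predictions, for example by thresholding predicted probabilities at 0.5, whereas AUROC is computed from the continuous scores and does not depend on a threshold.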
Conclusion: The ranking algorithm was helpful in deriving a parsimonious model for ES prediction with optimal performance. However, application of state-of-the-art machine learning models did not substantially improve model performance compared to prior logistic regression models. Thus, to further improve ES prediction, we may need to collect more samples and more variables that provide additional information.
Keywords: EEG monitoring; Electroencephalogram; Machine learning; Pediatric; Seizure.
Copyright © 2021. Published by Elsevier Ltd.