Revisiting the Endoscopic Third Ventriculostomy Success Score using machine learning: can we do better?

J Neurosurg Pediatr. 2024 Dec 6:1-9. doi: 10.3171/2024.9.PEDS24146. Online ahead of print.

Abstract

Objective: The Endoscopic Third Ventriculostomy Success Score (ETVSS) is a useful decision-making heuristic when considering the probability of surgical success, defined traditionally as no repeat cerebrospinal fluid diversion surgery needed within 6 months. Nonetheless, the performance of the logistic regression (LR) model in the original 2009 study was modest, with an area under the receiver operating characteristic curve (AUROC) of 0.68. The authors sought to use a larger dataset to develop more accurate machine learning (ML) models to predict endoscopic third ventriculostomy (ETV) success and also to perform the largest validation of the ETVSS to date.

Methods: The authors queried the MarketScan national database for the years 2005-2022 to identify patients < 18 years of age who underwent first-time ETV and subsequently had at least 6 months of continuous enrollment in the database. The authors collected data on predictors matching the original ETVSS: age, etiology of hydrocephalus, and history of any previous shunt placement. Next, they used 6 ML algorithms (LR, support vector classifier, random forest, k-nearest neighbors, Extreme Gradient Boosting [XGBoost], and naive Bayes) to develop predictive models. Finally, the authors used nested cross-validation to assess the models' comparative performances on unseen data.
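The nested cross-validation step can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the integer coding of the three predictors (N_LEVELS) is hypothetical, and only a categorical naive Bayes model (one of the six algorithm families named above) is tuned, over an arbitrary grid of smoothing values. The point is the structure: hyperparameters are chosen on inner folds drawn only from each outer training set, so every outer-fold AUROC is measured on data the tuning never saw.

```python
import math
import random

# Hypothetical integer coding of the three ETVSS-style predictors:
# 5 age bands, 3 etiology groups, 2 prior-shunt levels (no/yes).
N_LEVELS = [5, 3, 2]


def auroc(labels, scores):
    """AUROC as P(random success scores above random failure); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_nb(X, y, alpha):
    """Categorical naive Bayes with Laplace smoothing alpha; returns x -> P(success)."""
    prior, cond = {}, {}
    for c in (0, 1):
        Xc = [x for x, label in zip(X, y) if label == c]
        prior[c] = (len(Xc) + alpha) / (len(y) + 2 * alpha)
        for j, k in enumerate(N_LEVELS):
            for level in range(k):
                count = sum(1 for x in Xc if x[j] == level)
                cond[c, j, level] = (count + alpha) / (len(Xc) + k * alpha)

    def predict(x):
        log_odds = math.log(prior[1]) - math.log(prior[0])
        for j, level in enumerate(x):
            log_odds += math.log(cond[1, j, level]) - math.log(cond[0, j, level])
        return 1.0 / (1.0 + math.exp(-log_odds))

    return predict


def split_folds(indices, k, rng):
    """Shuffle indices and deal them into k roughly equal folds (not stratified)."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def evaluate(X, y, train_idx, test_idx, alpha):
    """Fit on train_idx and return the AUROC on test_idx."""
    model = fit_nb([X[i] for i in train_idx], [y[i] for i in train_idx], alpha)
    return auroc([y[i] for i in test_idx], [model(X[i]) for i in test_idx])


def nested_cv(X, y, alphas, outer_k=5, inner_k=3, seed=0):
    """Return one AUROC per outer fold; alpha is tuned on inner folds only."""
    rng = random.Random(seed)
    outer_folds = split_folds(range(len(y)), outer_k, rng)
    outer_scores = []
    for o, test_idx in enumerate(outer_folds):
        train_idx = [i for f, fold in enumerate(outer_folds) if f != o for i in fold]
        inner_folds = split_folds(train_idx, inner_k, rng)

        def inner_score(alpha):
            total = 0.0
            for v, val_idx in enumerate(inner_folds):
                fit_idx = [i for f, fold in enumerate(inner_folds) if f != v for i in fold]
                total += evaluate(X, y, fit_idx, val_idx, alpha)
            return total / inner_k

        best_alpha = max(alphas, key=inner_score)
        # Refit on the whole outer-training set, then score the untouched outer fold.
        outer_scores.append(evaluate(X, y, train_idx, test_idx, best_alpha))
    return outer_scores


def make_synthetic(n=600, seed=1):
    """Synthetic cohort with an arbitrary, made-up relationship to the outcome."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.randrange(k) for k in N_LEVELS]
        logit = -1.0 + 0.4 * x[0] + 0.3 * x[1] + 0.5 * x[2]
        X.append(x)
        y.append(1 if rng.random() < 1.0 / (1.0 + math.exp(-logit)) else 0)
    return X, y


X, y = make_synthetic()
scores = nested_cv(X, y, alphas=[0.1, 1.0, 10.0])
print("outer-fold AUROCs:", [round(s, 3) for s in scores])
```

In practice the same outer/inner structure would be repeated for each candidate algorithm, typically with stratified folds and an established library rather than hand-written models.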

Results: The authors identified 2047 patients who met inclusion criteria, of whom 1261 (61.6%) underwent successful ETV. The performances of most ML models were similar to that of the original ETVSS, which had an AUROC of 0.693 on the validation set and 0.661 (95% CI 0.600-0.722) on the test set. The authors' new LR model performed comparably, with an AUROC of 0.693 on both the validation and test sets (95% CI 0.633-0.754 on the test set). Among the more complex ML algorithms, XGBoost performed best, with AUROCs of 0.683 and 0.672 (95% CI 0.609-0.734) on the validation and test sets, respectively.
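One informal way to read the test-set figures above is to check whether the reported intervals overlap and what standard error each width implies. The sketch below uses only the numbers reported in this abstract. The class name and the assumption that each interval is a symmetric normal-approximation (Wald) interval are illustrative, since the abstract does not state how the intervals were computed. Overlap is a rough descriptive check, not a formal test of difference between models evaluated on the same patients.

```python
from dataclasses import dataclass

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


@dataclass(frozen=True)
class AurocEstimate:
    """Hypothetical container for a reported AUROC and its 95% CI."""
    name: str
    auroc: float
    lower: float
    upper: float

    def implied_se(self) -> float:
        """Standard error implied by the CI width, assuming a symmetric Wald interval."""
        return (self.upper - self.lower) / (2 * Z_95)

    def overlaps(self, other: "AurocEstimate") -> bool:
        """True if the two reported intervals share at least one point."""
        return self.lower <= other.upper and other.lower <= self.upper


# Test-set values as reported in the Results paragraph.
REPORTED = [
    AurocEstimate("ETVSS", 0.661, 0.600, 0.722),
    AurocEstimate("LR", 0.693, 0.633, 0.754),
    AurocEstimate("XGBoost", 0.672, 0.609, 0.734),
]

for est in REPORTED:
    print(f"{est.name}: AUROC {est.auroc:.3f}, implied SE about {est.implied_se():.3f}")

for i, a in enumerate(REPORTED):
    for b in REPORTED[i + 1:]:
        print(f"{a.name} vs {b.name}: intervals overlap = {a.overlaps(b)}")
```

Under these assumptions every pair of intervals overlaps substantially, which is consistent with the statement that the models performed similarly.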

Conclusions: This is the largest external validation of the ETVSS, and it confirms the score's modest performance. More sophisticated ML algorithms do not meaningfully improve predictive performance compared with the ETVSS; this underscores the need for more useful, novel, and higher-dimensional input data rather than changes in modeling strategy.

Keywords: data science; endoscopic third ventriculostomy; hydrocephalus; machine learning; pediatric neurosurgery; predictive modeling.