Assessing the Performance of Models from the 2022 RSNA Cervical Spine Fracture Detection Competition at a Level I Trauma Center

Radiol Artif Intell. 2024 Nov;6(6):e230550. doi: 10.1148/ryai.230550.

Abstract

Purpose To evaluate the performance of the top models from the RSNA 2022 Cervical Spine Fracture Detection challenge on a clinical test dataset of both noncontrast and contrast-enhanced CT scans acquired at a level I trauma center. Materials and Methods Seven top-performing models in the RSNA 2022 Cervical Spine Fracture Detection challenge were retrospectively evaluated on a clinical test set of 1828 CT scans (from 1829 series: 130 positive for fracture, 1699 negative for fracture; 1308 noncontrast, 521 contrast enhanced) from 1779 patients (mean age, 55.8 years ± 22.1 [SD]; 1154 [64.9%] male patients). Scans were acquired without exclusion criteria over 1 year (January-December 2022) from the emergency department of a neurosurgical and level I trauma center. Model performance was assessed using area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. False-positive and false-negative cases were further analyzed by a neuroradiologist. Results Although all seven models showed decreased performance on the clinical test set compared with the challenge dataset, the models maintained high performances. On noncontrast CT scans, the models achieved a mean AUC of 0.89 (range: 0.79-0.92), sensitivity of 67.0% (range: 30.9%-80.0%), and specificity of 92.9% (range: 82.1%-99.0%). On contrast-enhanced CT scans, the models had a mean AUC of 0.88 (range: 0.76-0.94), sensitivity of 81.9% (range: 42.7%-100.0%), and specificity of 72.1% (range: 16.4%-92.8%). The models identified 10 fractures missed by radiologists. False-positive cases were more common in contrast-enhanced scans and observed in patients with degenerative changes on noncontrast scans, while false-negative cases were often associated with degenerative changes and osteopenia. Conclusion The winning models from the 2022 RSNA AI Challenge demonstrated a high performance for cervical spine fracture detection on a clinical test dataset, warranting further evaluation for their use as clinical support tools. Keywords: Feature Detection, Supervised Learning, Convolutional Neural Network (CNN), Genetic Algorithms, CT, Spine, Technology Assessment, Head/Neck Supplemental material is available for this article. © RSNA, 2024 See also commentary by Levi and Politi in this issue.

Keywords: CT; Convolutional Neural Network (CNN); Feature Detection; Genetic Algorithms; Head/Neck; Spine; Supervised Learning; Technology Assessment.

MeSH terms

  • Adult
  • Cervical Vertebrae* / diagnostic imaging
  • Cervical Vertebrae* / injuries
  • Contrast Media
  • Female
  • Humans
  • Male
  • Middle Aged
  • Retrospective Studies
  • Sensitivity and Specificity
  • Spinal Fractures* / diagnosis
  • Spinal Fractures* / diagnostic imaging
  • Tomography, X-Ray Computed* / methods
  • Trauma Centers*

Substances

  • Contrast Media