The STOIC2021 COVID-19 AI challenge: Applying reusable training methodologies to private data

Med Image Anal. 2024 Oct:97:103230. doi: 10.1016/j.media.2024.103230. Epub 2024 Jun 5.

Abstract

Challenges drive the state of the art in automated medical image analysis. The quantity of public training data that they provide can limit the performance of their solutions. Public access to the training methodology for these solutions remains absent. This study implements the Type Three (T3) challenge format, which allows for training solutions on private data and guarantees reusable training methodologies. With T3, challenge organizers train a codebase provided by the participants on sequestered training data. T3 was implemented in the STOIC2021 challenge, with the goal of predicting from a computed tomography (CT) scan whether subjects had a severe COVID-19 infection, defined as intubation or death within one month. STOIC2021 consisted of a Qualification phase, where participants developed challenge solutions using 2000 publicly available CT scans, and a Final phase, where participants submitted their training methodologies with which solutions were trained on CT scans of 9724 subjects. The organizers successfully trained six of the eight Final phase submissions. The submitted codebases for training and running inference were released publicly. The winning solution obtained an area under the receiver operating characteristic curve for discerning between severe and non-severe COVID-19 of 0.815. The Final phase solutions of all finalists improved upon their Qualification phase solutions.
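
For readers unfamiliar with the reported metric, the area under the receiver operating characteristic curve (AUC) can be read as the probability that a randomly chosen severe case receives a higher predicted score than a randomly chosen non-severe case. The sketch below illustrates that definition in plain Python. It is not the challenge's evaluation code; the function name `roc_auc` and the toy labels and scores are assumptions made purely for illustration.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the ROC AUC.

    labels: iterable of 0/1, where 1 marks a severe case (illustrative convention).
    scores: iterable of predicted severity scores, higher meaning more severe.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from STOIC2021).
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 5 of 6 pairs ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of cases, so for a test set of realistic size a rank-based implementation from an established library would be the more practical choice.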

Keywords: COVID-19; Machine learning; Medical image analysis challenge.

MeSH terms

  • Artificial Intelligence
  • COVID-19*
  • Humans
  • SARS-CoV-2*
  • Tomography, X-Ray Computed*