Impact of signal intensity normalization of MRI on the generalizability of radiomic-based prediction of molecular glioma subtypes

Eur Radiol. 2024 Apr;34(4):2782-2790. doi: 10.1007/s00330-023-10034-2. Epub 2023 Sep 6.

Abstract

Objectives: Radiomic features have demonstrated encouraging results for non-invasive detection of molecular biomarkers, but the lack of guidelines for pre-processing MRI-data has led to poor generalizability. Here, we assessed the influence of different MRI-intensity normalization techniques on the performance of radiomics-based models for predicting molecular glioma subtypes.

Methods: Preoperative MRI data from n = 615 patients with newly diagnosed glioma and known isocitrate dehydrogenase (IDH) and 1p/19q status were pre-processed using four different methods: no normalization (naive), N4 bias field correction (N4), and N4 followed by either WhiteStripe (N4/WS) or z-score normalization (N4/z-score). A total of 377 Image-Biomarker-Standardisation-Initiative-compliant radiomic features were extracted from each normalized dataset, and 9 different machine-learning algorithms were trained for multiclass prediction of molecular glioma subtypes (IDH-mutant 1p/19q codeleted vs. IDH-mutant 1p/19q non-codeleted vs. IDH wild type). External testing was performed in public glioma datasets from UCSF (n = 410) and TCGA (n = 160).
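As an illustration of the z-score step named in the Methods, the following is a minimal sketch, not the authors' pipeline. It assumes the image has already been bias-corrected and flattened into a list of brain-voxel intensities; the function name `zscore_normalize` is hypothetical, and a real pipeline would typically operate on 3D arrays with a brain mask.

```python
import statistics


def zscore_normalize(intensities):
    """Rescale intensities to zero mean and unit standard deviation.

    `intensities` is assumed to be a flat sequence of voxel values taken
    from inside the brain (an assumption for this sketch).
    """
    values = [float(v) for v in intensities]
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        raise ValueError("constant image: standard deviation is zero")
    return [(v - mean) / std for v in values]


# Two hypothetical scans of the same tissue that differ only by a
# scanner-dependent gain and offset end up on the same scale.
scan_a = [10.0, 20.0, 30.0, 40.0]
scan_b = [3.0 * v + 100.0 for v in scan_a]
normalized_a = zscore_normalize(scan_a)
normalized_b = zscore_normalize(scan_b)
```

The example shows why such a step can matter for multi-institutional data: a linear difference in intensity scale between scanners is removed before features are extracted.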

Results: Support vector machine yielded the best performance with macro-average AUCs of 0.84 (naive), 0.84 (N4), 0.87 (N4/WS), and 0.87 (N4/z-score) in the internal test set. Both N4/WS and N4/z-score outperformed the other approaches in the external UCSF and TCGA test sets with macro-average AUCs ranging from 0.85 to 0.87, replicating the performance of the internal test set, in contrast to macro-average AUCs ranging from 0.19 to 0.45 for naive and 0.26 to 0.52 for N4 alone.
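For readers unfamiliar with the metric, a macro-average AUC for a three-class problem can be sketched as the unweighted mean of one-vs-rest AUCs. The abstract does not state how the authors computed it, so the one-vs-rest scheme, the function names, and the toy data below are assumptions for illustration only.

```python
def binary_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    positives = [s for is_pos, s in zip(labels, scores) if is_pos]
    negatives = [s for is_pos, s in zip(labels, scores) if not is_pos]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


def macro_average_auc(y_true, y_prob, classes):
    """Unweighted mean of one-vs-rest AUCs (assumed averaging scheme)."""
    aucs = []
    for k, cls in enumerate(classes):
        labels = [y == cls for y in y_true]
        scores = [row[k] for row in y_prob]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)


# Hypothetical class names and predicted probabilities, not study data.
classes = ["IDH-mut codel", "IDH-mut non-codel", "IDH-wt"]
y_true = ["IDH-mut codel", "IDH-mut non-codel", "IDH-wt", "IDH-wt"]
good_prob = [[0.8, 0.1, 0.1], [0.1, 0.7, 0.2], [0.1, 0.2, 0.7], [0.2, 0.2, 0.6]]
bad_prob = [[0.1, 0.4, 0.5], [0.5, 0.1, 0.4], [0.5, 0.4, 0.1], [0.4, 0.5, 0.1]]
good_auc = macro_average_auc(y_true, good_prob, classes)
bad_auc = macro_average_auc(y_true, bad_prob, classes)
```

Under this definition a value of 0.5 corresponds to chance-level ranking, so values well below 0.5 indicate that the model ranks cases in a systematically inverted order on that dataset.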

Conclusion: Intensity normalization of MRI data is essential for the generalizability of radiomic-based machine-learning models. Specifically, both the N4/WS and N4/z-score approaches preserve the high model performance when the developed radiomic-based machine-learning model is applied in an external, heterogeneous, multi-institutional setting.

Clinical relevance statement: Intensity normalization such as N4/WS or N4/z-score can be used to develop reliable radiomics-based machine learning models from heterogeneous multicentre MRI datasets and provide non-invasive prediction of glioma subtypes.

Key points:

  • MRI-intensity normalization increases the stability of radiomics-based models and leads to better generalizability.
  • Intensity normalization did not appear relevant when the developed model was applied to homogeneous data from the same institution.
  • Radiomic-based machine learning algorithms are a promising approach for simultaneous classification of IDH and 1p/19q status of glioma.

Keywords: Genotype; Glioma; Isocitrate dehydrogenase; Magnetic resonance imaging.

MeSH terms

  • Biomarkers
  • Brain Neoplasms* / diagnostic imaging
  • Brain Neoplasms* / genetics
  • Glioma* / diagnostic imaging
  • Glioma* / genetics
  • Humans
  • Isocitrate Dehydrogenase / genetics
  • Magnetic Resonance Imaging / methods
  • Mutation
  • Radiomics
  • Retrospective Studies

Substances

  • Biomarkers
  • Isocitrate Dehydrogenase