Objectives: Radiomic features have demonstrated encouraging results for non-invasive detection of molecular biomarkers, but the lack of guidelines for pre-processing MRI-data has led to poor generalizability. Here, we assessed the influence of different MRI-intensity normalization techniques on the performance of radiomics-based models for predicting molecular glioma subtypes.
Methods: Preoperative MRI-data from n = 615 patients with newly diagnosed glioma and known isocitrate dehydrogenase (IDH) and 1p/19q status were pre-processed using four different methods: no normalization (naive), N4 bias field correction (N4), N4 followed by either WhiteStripe (N4/WS), or z-score normalization (N4/z-score). A total of 377 Image-Biomarker-Standardisation-Initiative-compliant radiomic features were extracted from each normalized data, and 9 different machine-learning algorithms were trained for multiclass prediction of molecular glioma subtypes (IDH-mutant 1p/19q codeleted vs. IDH-mutant 1p/19q non-codeleted vs. IDH wild type). External testing was performed in public glioma datasets from UCSF (n = 410) and TCGA (n = 160).
Results: Support vector machine yielded the best performance with macro-average AUCs of 0.84 (naive), 0.84 (N4), 0.87 (N4/WS), and 0.87 (N4/z-score) in the internal test set. Both N4/WS and z-score outperformed the other approaches in the external UCSF and TCGA test sets with macro-average AUCs ranging from 0.85 to 0.87, replicating the performance of the internal test set, in contrast to macro-average AUCs ranging from 0.19 to 0.45 for naive and 0.26 to 0.52 for N4 alone.
Conclusion: Intensity normalization of MRI data is essential for the generalizability of radiomic-based machine-learning models. Specifically, both N4/WS and N4/z-score approaches allow to preserve the high model performance, yielding generalizable performance when applying the developed radiomic-based machine-learning model in an external heterogeneous, multi-institutional setting.
Clinical relevance statement: Intensity normalization such as N4/WS or N4/z-score can be used to develop reliable radiomics-based machine learning models from heterogeneous multicentre MRI datasets and provide non-invasive prediction of glioma subtypes.
Key points: • MRI-intensity normalization increases the stability of radiomics-based models and leads to better generalizability. • Intensity normalization did not appear relevant when the developed model was applied to homogeneous data from the same institution. • Radiomic-based machine learning algorithms are a promising approach for simultaneous classification of IDH and 1p/19q status of glioma.
Keywords: Genotype; Glioma; Isocitrate dehydrogenase; Magnetic resonance imaging.
© 2023. The Author(s).