Objective: To investigate whether machine learning analysis of multiparametric MR radiomics can help classify immunohistochemical (IHC) subtypes of breast cancer. Study design: One hundred and thirty-four consecutive patients with pathologically proven invasive ductal carcinoma were retrospectively analyzed. A total of 2,498 features were extracted from the dynamic contrast-enhanced (DCE) and diffusion-weighted imaging (DWI) images, together with newly calculated images: DCE images changing over six time points (DCEsequential) and DWI images changing over three b-values (DWIsequential). We proposed a novel two-stage feature selection method combining traditional statistical and machine learning-based methods. Accuracy was assessed for two tasks: classification into the four IHC subtypes (4-IHC) and discrimination of triple-negative (TN) from non-TN cancers. Results: For the 4-IHC classification task, the best accuracy of 72.4% was achieved with linear discriminant analysis (LDA) or a subspace discriminant ensemble, each using 20 selected features. Using a single sequential feature from DWIsequential, the small dependent emphasis of Kendall-tau-b, the LDA model yielded an accuracy of 53.7%. For distinguishing TN from non-TN cancers, the linear support vector machine (SVM) and medium k-nearest neighbor classifiers using eight features yielded the highest accuracy of 91.0%, while the maximum variance feature from DWIsequential alone, with a linear SVM model, achieved an accuracy of 83.6%. Conclusions: Whole-tumor radiomics on multiparametric MR images, on DCE images changing over time points, and on DWI images changing over b-values provides a non-invasive analytical approach for breast cancer subtype classification and TN cancer identification.
Keywords: breast cancer; diffusion-weighted imaging; dynamic contrast-enhanced imaging; immunohistochemical subtypes; machine learning; radiomics.
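Illustrative sketch: the abstract names Kendall-tau-b and variance in connection with the DCEsequential and DWIsequential images but does not say how those images were computed. One plausible reading, assumed here and not confirmed by the abstract, is that each voxel's signal across the sequence (six time points or three b-values) is summarized by its Kendall tau-b correlation with the sequence index and by its variance, and that radiomic features are then extracted from the resulting maps. The function names, the toy signals, the b-values, the use of population variance, and the convention of mapping a constant signal to 0 are all hypothetical choices made for this example.

```python
import math
import statistics


def kendall_tau_b(x, y):
    """Kendall tau-b rank correlation of two equal-length sequences (ties allowed)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    score = 0      # concordant pairs minus discordant pairs
    untied_x = 0   # number of pairs that are not tied in x
    untied_y = 0   # number of pairs that are not tied in y
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx = (x[j] > x[i]) - (x[j] < x[i])  # sign of the change in x
            dy = (y[j] > y[i]) - (y[j] < y[i])  # sign of the change in y
            score += dx * dy
            untied_x += dx != 0
            untied_y += dy != 0
    denom = math.sqrt(untied_x * untied_y)
    # Assumption: a signal with no variation at all is mapped to 0.
    return score / denom if denom else 0.0


def sequential_maps(series_by_voxel, sequence):
    """Return (tau_map, var_map): per-voxel Kendall tau-b against `sequence`
    and per-voxel population variance of the signal."""
    tau_map = {}
    var_map = {}
    for voxel, signal in series_by_voxel.items():
        tau_map[voxel] = kendall_tau_b(sequence, signal)
        var_map[voxel] = statistics.pvariance(signal)
    return tau_map, var_map


# Hypothetical b-values and toy signals for a 2x2 patch (not data from the study).
B_VALUES = [0, 500, 1000]
TOY_SIGNALS = {
    (0, 0): [100.0, 60.0, 30.0],  # steadily decaying signal
    (0, 1): [50.0, 50.0, 40.0],   # decay with one tie
    (1, 0): [20.0, 20.0, 20.0],   # constant signal
    (1, 1): [10.0, 30.0, 20.0],   # non-monotonic signal
}

tau_map, var_map = sequential_maps(TOY_SIGNALS, B_VALUES)
for voxel in sorted(TOY_SIGNALS):
    print(voxel, round(tau_map[voxel], 3), round(var_map[voxel], 3))
```

Under this reading, a feature such as the maximum of the variance map would be an ordinary first-order radiomic feature computed on `var_map`, and a texture feature of the Kendall-tau-b map would be computed on `tau_map` after discretization. In practice a vectorized implementation such as `scipy.stats.kendalltau` (whose default variant is tau-b) would be a natural replacement for the pure-Python loop.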