Deep Learning Applied to Diffusion-weighted Imaging for Differentiating Malignant from Benign Breast Tumors without Lesion Segmentation

Mami Iima; Ryosuke Mizuno; Masako Kataoka; Kazuki Tsuji; Toshiki Yamazaki; Akihiko Minami; Maya Honda; Keiho Imanishi; Masahiro Takada; Yuji Nakamoto

doi:10.1148/ryai.240206

Radiol Artif Intell. 2025 Jan;7(1):e240206. doi: 10.1148/ryai.240206.

Affiliation

¹ From the Department of Fundamental Development for Advanced Low Invasive Diagnostic Imaging, Nagoya University Graduate School of Medicine, 65 Tsurumai-cho, Showa-ku, Nagoya 466-8550, Japan (M.I.); Department of Diagnostic Imaging and Nuclear Medicine, Kyoto University Graduate School of Medicine, Kyoto, Japan (M.I., M.K., M.H., Y.N.); A.I. System Research, Kyoto, Japan (R.M.); Kyoto University Faculty of Medicine, Kyoto, Japan (K.T., T.Y.); Department of Diagnostic Radiology, Kyoto City Hospital, Kyoto, Japan (A.M.); Department of Diagnostic Radiology, Kansai Electric Power Hospital, Osaka, Japan (M.H.); e-Growth, Kyoto, Japan (K.I.); and Department of Breast Surgery, Kyoto University Graduate School of Medicine, Kyoto, Japan (M.T.).

PMID: 39565222

Abstract

Purpose: To evaluate and compare the performance of different artificial intelligence (AI) models in differentiating between benign and malignant breast tumors at diffusion-weighted imaging (DWI), including comparison with radiologist assessments.

Materials and Methods: In this retrospective study, patients with breast lesions underwent 3-T breast MRI from May 2019 to March 2022. In addition to T1-weighted imaging, T2-weighted imaging, and contrast-enhanced imaging, DWI was performed with five b values (0, 200, 800, 1000, and 1500 sec/mm²). DWI data split into training and tuning and test sets were used for the development and assessment of AI models, including a small two-dimensional (2D) convolutional neural network (CNN), ResNet-18, EfficientNet-B0, and a three-dimensional (3D) CNN. Performance of the DWI-based models in differentiating between benign and malignant breast tumors was compared with that of radiologists assessing standard breast MR images, with diagnostic performance assessed using receiver operating characteristic analysis. The study also examined data augmentation effects (augmentation A: random elastic deformation, augmentation B: random affine transformation and random noise, and augmentation C: mixup) on model performance.

Results: A total of 334 breast lesions in 293 patients (mean age, 54.9 years ± 14.3 [SD]; all female) were analyzed. The 2D CNN models outperformed the 3D CNN on the test dataset (area under the receiver operating characteristic curve [AUC] with different data augmentation methods: range, 0.83-0.88 vs 0.75-0.76). There was no evidence of a difference in performance between the small 2D CNN with augmentations A and B (AUC: 0.88) and the radiologists (AUC: 0.86) on the test dataset (P = .64). When comparing the small 2D CNN to radiologists, there was no evidence of a difference in specificity (81.4% vs 72.1%, P = .64) or sensitivity (85.9% vs 98.8%, P = .64).
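The results above are reported as AUC, sensitivity, and specificity. As a minimal sketch of how such metrics can be computed from per-lesion malignancy scores, the following uses only the Python standard library. The toy labels and scores, the 0.5 decision threshold, and the function names are hypothetical illustrations and are not taken from the study; the abstract does not describe the authors' statistical pipeline.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a malignant lesion (label 1) receives a
    higher score than a benign lesion (label 0); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called malignant."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data (NOT study data): 1 = malignant, 0 = benign.
toy_labels = [1, 1, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.1]

auc = roc_auc(toy_labels, toy_scores)
sens, spec = sensitivity_specificity(toy_labels, toy_scores, threshold=0.5)
print(f"AUC={auc:.4f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

The pairwise definition is quadratic in the number of lesions, which is acceptable for a test set of a few hundred lesions; a rank-based implementation would be preferable for larger datasets.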
Conclusion: AI models, particularly a small 2D CNN, showed good performance in differentiating between malignant and benign breast tumors using DWI, without needing manual segmentation.

Keywords: MR Imaging, Breast, Comparative Studies, Feature Detection, Diagnosis

Supplemental material is available for this article.

©RSNA, 2024.
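Augmentation C in Materials and Methods is mixup, which trains on convex combinations of pairs of inputs and their labels. The sketch below illustrates the general technique on tiny 2D arrays using only the Python standard library. The function name, the `alpha = 0.2` Beta parameter, the one-hot label layout, and the toy images are assumptions made for illustration; the abstract does not state the hyperparameters or implementation the authors used.

```python
import random
from typing import List, Optional, Tuple

Image = List[List[float]]


def mixup(
    img_a: Image,
    label_a: List[float],
    img_b: Image,
    label_b: List[float],
    alpha: float = 0.2,  # assumed value, not taken from the study
    rng: Optional[random.Random] = None,
) -> Tuple[Image, List[float], float]:
    """Blend two images and their one-hot labels with one shared weight.

    The weight lam is drawn from Beta(alpha, alpha); the same lam is applied
    to pixels and labels so the soft label matches the image mixture.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)
    mixed_img = [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(img_a, img_b)
    ]
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed_img, mixed_label, lam


# Hypothetical 2x2 "images"; labels are one-hot as [benign, malignant].
malignant_img = [[1.0, 1.0], [1.0, 1.0]]
benign_img = [[0.0, 0.0], [0.0, 0.0]]

mixed_img, mixed_label, lam = mixup(
    malignant_img, [0.0, 1.0], benign_img, [1.0, 0.0], rng=random.Random(0)
)
print(lam, mixed_label)
```

Because the labels become soft targets, a training loop using this kind of augmentation would need a loss that accepts probability vectors rather than hard class indices.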


MeSH terms

Adult
Aged
Breast / diagnostic imaging
Breast / pathology
Breast Neoplasms* / diagnostic imaging
Breast Neoplasms* / pathology
Deep Learning*
Diagnosis, Differential
Diffusion Magnetic Resonance Imaging* / methods
Female
Humans
Image Interpretation, Computer-Assisted / methods
Middle Aged
Retrospective Studies