Cross-site Validation of AI Segmentation and Harmonization in Breast MRI

J Imaging Inform Med. 2024 Sep 25. doi: 10.1007/s10278-024-01266-9. Online ahead of print.

Abstract

This work aims to perform a cross-site validation of automated segmentation for breast cancers in MRI and to compare its performance with that of radiologists. A three-dimensional (3D) U-Net was trained to segment cancers in dynamic contrast-enhanced axial MRIs using a large dataset from Site 1 (n = 15,266; 449 malignant and 14,817 benign). Performance was validated on site-specific test data from Site 1 and two additional sites, and on common, publicly available test data. Four radiologists from each of the three clinical sites provided two-dimensional (2D) segmentations as ground truth. Segmentation performance did not differ between the network and radiologists on the test data from Sites 1 and 2 or on the common public data (median Dice score: Site 1, network 0.86 vs. radiologist 0.85, n = 114; Site 2, 0.91 vs. 0.91, n = 50; common, 0.93 vs. 0.90). For Site 3, an affine input layer was fine-tuned using segmentation labels, resulting in comparable performance between the network and radiologists (0.88 vs. 0.89, n = 42). Radiologist performance differed on the common test data, and the network numerically outperformed 11 of the 12 radiologists (median Dice: 0.85-0.94, n = 20). In conclusion, a deep network with a novel supervised harmonization technique matches radiologists' performance in MRI tumor segmentation across clinical sites. We make code and weights publicly available to promote reproducible AI in radiology.
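
The comparisons above are reported as median Dice scores. For readers unfamiliar with the metric, the following is a minimal sketch of how a Dice score between two binary masks, and a median over cases, could be computed. It is not the authors' released code: the function name `dice_score`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np


def dice_score(pred, truth):
    """Dice overlap between two binary masks of the same shape.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    overlap = np.logical_and(pred, truth).sum()
    return 2.0 * overlap / total


# Toy 2D example: a 4-pixel reference mask and a 6-pixel prediction
# that fully covers it (overlap of 4 pixels).
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:4] = True

score = dice_score(pred, truth)  # 2 * 4 / (4 + 6)

# A per-site summary would take the median over per-case scores.
per_case = [score, dice_score(truth, truth)]
median_dice = float(np.median(per_case))
```

Because the radiologists' reference segmentations are 2D, a score of this kind would be evaluated on the annotated slice; how the 3D network output is reduced to that slice is not specified in the abstract.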

Keywords: Breast cancer segmentation; Cross-site evaluation; Deep learning; Dynamic contrast enhancement; Harmonization; MRI.