The purpose of this study is to investigate the robustness of a commonly used convolutional neural network for image segmentation with respect to nearly unnoticeable adversarial perturbations, and to suggest new methods for making these networks more robust to such perturbations. In this retrospective study, the accuracy of brain tumor segmentation was evaluated in subjects with low- and high-grade gliomas. Two representative U-Nets were implemented to segment four MR series (T1-weighted, post-contrast T1-weighted, T2-weighted, and T2-weighted FLAIR) into four pixelwise labels (Gd-enhancing tumor, peritumoral edema, necrotic and non-enhancing tumor, and background). We developed attack strategies based on the fast gradient sign method (FGSM), iterative FGSM (i-FGSM), and targeted iterative FGSM (ti-FGSM) to produce effective yet imperceptible perturbations. Additionally, we explored the effectiveness of distillation and of adversarial training via data augmentation as defenses against these attacks. Robustness was quantified by comparing Dice coefficients across attack conditions using Wilcoxon signed-rank tests. The experimental results show that attacks based on FGSM, i-FGSM, and ti-FGSM degraded segmentation quality, as measured by the Dice coefficient, by up to 65%. Among the defenses, distillation performed significantly better than the adversarial training approaches. However, all defenses yielded lower performance than that achieved on unperturbed test images. Therefore, segmentation networks can be adversely affected by targeted attacks that introduce visually minor (and potentially undetectable) modifications to existing images. With increasing interest in applying deep learning techniques to medical imaging data, it is important to quantify the ramifications of adversarial inputs, whether intentional or unintentional.
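The abstract does not include implementation details; as a rough illustration only, the sketch below shows single-step FGSM and iterative FGSM in PyTorch, applied to a pixelwise cross-entropy segmentation loss. The toy model, epsilon, step size, and iteration count are hypothetical placeholders rather than the authors' architecture or settings; the targeted variant (ti-FGSM) would instead step in the negative gradient direction of an attacker-chosen target label map.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm_attack(model, image, target, epsilon=2.0 / 255):
    """Single-step FGSM: perturb the input in the direction of the loss gradient sign."""
    image = image.clone().detach().requires_grad_(True)
    logits = model(image)                       # (N, C, H, W) class logits
    loss = F.cross_entropy(logits, target)      # target: (N, H, W) pixelwise labels
    loss.backward()
    adv = image + epsilon * image.grad.sign()   # step to increase the segmentation loss
    return adv.clamp(0.0, 1.0).detach()

def ifgsm_attack(model, image, target, epsilon=2.0 / 255, alpha=0.5 / 255, steps=10):
    """Iterative FGSM: repeat small FGSM steps, projecting back into an epsilon ball."""
    adv = image.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), target)
        loss.backward()
        with torch.no_grad():
            adv = adv + alpha * adv.grad.sign()
            adv = image + (adv - image).clamp(-epsilon, epsilon)  # keep perturbation small
            adv = adv.clamp(0.0, 1.0)
        adv = adv.detach()
    return adv

if __name__ == "__main__":
    # Toy stand-in for a segmentation network: 4 MR input channels -> 4 label classes.
    model = nn.Conv2d(4, 4, kernel_size=3, padding=1)
    image = torch.rand(1, 4, 64, 64)            # hypothetical normalized multi-series input
    target = torch.randint(0, 4, (1, 64, 64))   # hypothetical pixelwise ground-truth labels
    adv = ifgsm_attack(model, image, target)
    print("max |perturbation|:", (adv - image).abs().max().item())
```

The key design point the sketch illustrates is that the perturbation magnitude is bounded (here by clamping to an epsilon ball), which is what keeps the adversarial change visually minor while still shifting the pixelwise predictions.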
Keywords: Adversarial attacks; Deep learning segmentation; Defenses; Robustness.
© 2021. Society for Imaging Informatics in Medicine.