An information-theoretic perspective of physical adversarial patches

Neural Netw. 2024 Nov:179:106590. doi: 10.1016/j.neunet.2024.106590. Epub 2024 Aug 3.

Abstract

Real-world adversarial patches have been shown to compromise state-of-the-art models in various computer vision applications. Most existing defenses rely on analyzing input-level or feature-level gradients to detect the patch. However, these methods have been compromised by recent GAN-based attacks that generate naturalistic patches. In this paper, we propose a new perspective for defending against adversarial patches, based on the entropy carried by the input rather than on its saliency. We present Jedi, a new defense against adversarial patches that tackles the patch localization problem from an information theory perspective: it leverages the high entropy of adversarial patches to identify potential patch zones, and uses an autoencoder to complete patch regions from high-entropy kernels. Jedi achieves high-precision adversarial patch localization and removal, detecting on average 90% of adversarial patches across different benchmarks, and recovering up to 94% of successful patch attacks. Since Jedi relies on an input entropy analysis, it is model-agnostic and can be applied to off-the-shelf models without changes to their training or inference. Moreover, we propose a comprehensive qualitative analysis that investigates the cases where Jedi fails, in comparison with related methods. Interestingly, we find that a significant core of the failure cases across the different defenses shares one common property: high entropy. We think that this work offers a new perspective for understanding the adversarial effect in physical-world settings. We also leverage these findings to enhance Jedi's handling of entropy outliers by introducing Adaptive Jedi, which boosts performance by up to 9% on challenging images.
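To make the idea of entropy-based patch localization concrete, the following is a minimal sketch of flagging high-entropy windows in a grayscale image. It is not Jedi's implementation: the function names, the 16-pixel kernel, the 8-pixel stride, and the fallback threshold (mean plus one standard deviation of the heat map) are all assumptions chosen for illustration, and the autoencoder-based completion step described in the abstract is omitted.

```python
import numpy as np


def window_entropy(window, bins=256):
    """Shannon entropy, in bits, of the intensity histogram of a uint8 window."""
    counts = np.bincount(window.ravel(), minlength=bins)
    probs = counts[counts > 0] / window.size
    return float(-(probs * np.log2(probs)).sum())


def entropy_heatmap(gray, kernel=16, stride=8):
    """Entropy of every kernel x kernel window, sampled every `stride` pixels."""
    height, width = gray.shape
    rows = (height - kernel) // stride + 1
    cols = (width - kernel) // stride + 1
    heat = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            y, x = i * stride, j * stride
            heat[i, j] = window_entropy(gray[y:y + kernel, x:x + kernel])
    return heat


def high_entropy_mask(gray, kernel=16, stride=8, threshold=None):
    """Boolean pixel mask covering windows whose entropy exceeds `threshold`.

    If no threshold is given, mean + one standard deviation of the heat map
    is used. That fallback is a hypothetical choice, not taken from the paper.
    """
    heat = entropy_heatmap(gray, kernel, stride)
    if threshold is None:
        threshold = heat.mean() + heat.std()
    mask = np.zeros(gray.shape, dtype=bool)
    for i, j in zip(*np.nonzero(heat > threshold)):
        y, x = i * stride, j * stride
        mask[y:y + kernel, x:x + kernel] = True
    return mask


if __name__ == "__main__":
    # Synthetic example: a flat image with one noisy 16 x 16 "patch".
    rng = np.random.default_rng(0)
    image = np.full((64, 64), 100, dtype=np.uint8)
    image[16:32, 16:32] = rng.integers(0, 256, size=(16, 16), dtype=np.uint8)
    mask = high_entropy_mask(image, threshold=6.0)
    print("flagged pixels:", int(mask.sum()))
```

A flat region has zero entropy under this measure, while a window filled with noise approaches the 8-bit maximum, so a fixed threshold separates the two in the synthetic case. Natural images with dense texture are also high in entropy, which is consistent with the failure cases the abstract attributes to entropy outliers.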

Keywords: Adversarial patches; Computer vision; Convolutional neural networks; Entropy.

MeSH terms

  • Algorithms
  • Entropy*
  • Humans
  • Information Theory*
  • Neural Networks, Computer