Cluster-CAM: Cluster-weighted visual interpretation of CNNs' decision in image classification

Neural Netw. 2024 Oct:178:106473. doi: 10.1016/j.neunet.2024.106473. Epub 2024 Jun 20.

Abstract

Despite the tremendous success of convolutional neural networks (CNNs) in computer vision, their inner workings still lack a clear interpretation. Class activation mapping (CAM), a popular visualization technique for interpreting CNN decisions, has drawn increasing attention. Gradient-based CAMs are efficient, but their performance is heavily affected by vanishing and exploding gradients. In contrast, gradient-free CAMs avoid computing gradients and produce more understandable results. However, they are quite time-consuming because hundreds of forward inferences per image are required. In this paper, we propose Cluster-CAM, an effective and efficient gradient-free CNN interpretation algorithm. Cluster-CAM significantly reduces the number of forward propagations by splitting the feature maps into clusters. Furthermore, we propose an artful strategy to forge a cognition-base map and cognition-scissors from the clustered feature maps. The final saliency heatmap is produced by merging these cognition maps. Qualitative results clearly show that Cluster-CAM produces heatmaps whose highlighted regions match human cognition more precisely than those of existing CAMs. The quantitative evaluation further demonstrates the superiority of Cluster-CAM in both effectiveness and efficiency.
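The efficiency argument in the abstract (one forward pass per cluster rather than one per feature map) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the clustering method or how the cognition-base map and cognition-scissors are built, so the sketch assumes plain k-means and a generic gradient-free weighting in which each cluster's aggregated map masks the input and is weighted by the resulting class score. The names `kmeans`, `normalize`, `clustered_saliency`, and `score_fn` are hypothetical, and the feature maps are assumed to already match the image resolution.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumption; the abstract names no method).
    Returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # Squared distance from every row to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def normalize(m):
    """Rescale an array to [0, 1]; a constant array maps to zeros."""
    span = m.max() - m.min()
    return (m - m.min()) / span if span > 0 else np.zeros_like(m)

def clustered_saliency(feature_maps, image, score_fn, k=4):
    """feature_maps: (C, H, W); image: (H, W).
    score_fn is a hypothetical stand-in for one forward pass of the CNN
    that returns the target-class score for a masked image."""
    C, H, W = feature_maps.shape
    labels = kmeans(feature_maps.reshape(C, -1), k)
    heatmap = np.zeros((H, W))
    n_forward = 0
    for j in range(k):
        members = feature_maps[labels == j]
        if len(members) == 0:
            continue
        mask = normalize(members.sum(axis=0))  # one mask per cluster
        weight = score_fn(image * mask)        # one forward pass per cluster
        n_forward += 1
        heatmap += weight * mask
    return normalize(np.maximum(heatmap, 0)), n_forward

# Toy usage: 64 feature maps, but at most 4 forward passes.
rng = np.random.default_rng(1)
maps = rng.random((64, 8, 8))
img = rng.random((8, 8))
toy_score = lambda x: float(x[2:6, 2:6].mean())  # not a real classifier
heat, passes = clustered_saliency(maps, img, toy_score, k=4)
```

With 64 feature maps and `k=4`, the toy scoring function is called at most four times, whereas a per-map gradient-free scheme would call it 64 times. In a real setting `score_fn` would wrap a CNN forward pass and the masks would be upsampled to the input size.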

Keywords: Class activation mapping; Clustering algorithm; Explainable artificial intelligence; Image classification.

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Cognition / physiology
  • Humans
  • Image Processing, Computer-Assisted / methods
  • Neural Networks, Computer*