GradToken: Decoupling tokens with class-aware gradient for visual explanation of Transformer network

Lin Cheng; Yanjie Liang; Yang Lu; Yiu-Ming Cheung

doi:10.1016/j.neunet.2024.106837

Neural Netw. 2024 Nov 1:181:106837. doi: 10.1016/j.neunet.2024.106837. Online ahead of print.

Authors

Lin Cheng¹, Yanjie Liang², Yang Lu³, Yiu-Ming Cheung⁴

Affiliations

¹ Fujian Key Laboratory of Sensing and Computing for Smart City, School of Informatics, Xiamen University, Xiamen 361005, China.
² Peng Cheng Laboratory, Shenzhen 518000, China.
³ Fujian Key Laboratory of Sensing and Computing for Smart City, School of Informatics, Xiamen University, Xiamen 361005, China.
⁴ Department of Computer Science, Hong Kong Baptist University, Hong Kong Special Administrative Region of China.

PMID: 39536602

Abstract

Transformer networks are widely used in computer vision, natural language processing, graph-structured data analysis, and other fields. Consequently, explanations of Transformer networks play a key role in helping humans understand and analyze their decision-making and working mechanisms, thereby improving their trustworthiness in real-world applications. However, existing explanation methods for convolutional neural networks are difficult to apply to Transformer networks because the two architectures differ significantly. Designing a specific and effective explanation method for Transformer networks therefore remains a challenge. To address this challenge, we first analyze the semantic coupling problem of attention weight matrices in Transformer networks, which hinders distinctive explanations for different target categories. Then, we propose a gradient-decoupling-based token relevance method, GradToken, for the visual explanation of Transformer predictions. GradToken exploits the class-aware gradient to decouple the tangled semantics in the class token into the semantics corresponding to each category. It then leverages the relations between the class token and the spatial tokens to generate relevance maps. As a result, the visual explanations generated by GradToken effectively focus on the regions of the selected targets. Extensive quantitative and qualitative experiments verify the validity and reliability of the proposed method.
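
The abstract does not give GradToken's equations, so the following is only a rough illustration of the general idea it describes: weighting the class token by a class-specific gradient, then scoring spatial tokens against the result. It is not the authors' formulation. The linear classification head, the element-wise gradient weighting, the dot-product scoring, and all function names (`class_gradient`, `decouple`, `relevance_map`) are assumptions made for this sketch.

```python
# Illustrative sketch only: NOT the GradToken formulation from the paper.
# Assumption: a linear head on the class token (logits = W @ cls), so the
# gradient of logit k with respect to the class token is row k of W.

def class_gradient(weights, k):
    """Gradient of logit k w.r.t. the class token under the linear-head assumption."""
    return list(weights[k])

def decouple(cls_token, grad):
    """Keep only the class-token channels that push the chosen logit up."""
    return [max(c * g, 0.0) for c, g in zip(cls_token, grad)]

def relevance_map(cls_token, spatial_tokens, weights, k):
    """Score each spatial token against the class-specific class token."""
    cls_k = decouple(cls_token, class_gradient(weights, k))
    scores = [max(sum(a * b for a, b in zip(cls_k, tok)), 0.0)
              for tok in spatial_tokens]
    top = max(scores)
    # Normalize to [0, 1] when at least one token is relevant.
    return [s / top for s in scores] if top > 0 else scores

# Toy data (hypothetical): one class token, three spatial tokens, two classes.
cls_token = [1.0, 1.0]
spatial_tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights = [[1.0, -1.0], [-1.0, 1.0]]

map_class0 = relevance_map(cls_token, spatial_tokens, weights, 0)
map_class1 = relevance_map(cls_token, spatial_tokens, weights, 1)
print(map_class0)
print(map_class1)
```

In this toy setup the same class token yields different maps for the two classes, which is the kind of class-specific behavior the abstract attributes to decoupling. A real Transformer would require gradients obtained by backpropagation through the network rather than a closed-form row of a weight matrix.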

Keywords: Explanation; Interpretability; Transformer; Visualization.