Accurate detection of surface defects on strip steel is essential for ensuring product quality. Existing deep-learning-based detectors for strip steel surface defects typically refine and integrate the coarse outputs of the backbone network iteratively, enhancing the model's ability to express defect characteristics. Attention mechanisms, including spatial attention, channel attention and self-attention, are among the most prevalent techniques for feature extraction and fusion. This paper introduces a triple-attention mechanism (TA), characterized by interrelated and complementary interactions, that concurrently and iteratively refines and integrates feature maps from three distinct perspectives, thereby enhancing the representational capacity of the features. The idea stems from the following observation: a three-dimensional feature map can be examined from three different yet interrelated two-dimensional planar perspectives, namely the channel-width, channel-height, and width-height perspectives. Based on the TA, we propose a novel detector for strip steel surface defects, called TADet. TADet is an encoder-decoder network: the decoder uses the proposed TA to refine and fuse the multiscale coarse features generated by the encoder (backbone network) from the three distinct perspectives (branches), and then integrates the refined feature maps from the three branches. Extensive experimental results show that TADet outperforms state-of-the-art methods in terms of mean absolute error, S-measure, E-measure and F-measure, confirming the effectiveness and robustness of the proposed TADet. Our code and experimental results are available at https://github.com/hpguo1982/TADet.
Keywords: Defect detection; Feature fusion; Spatial attention; Triple attention.
© 2025. The Author(s).
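The three-perspective observation in the abstract can be illustrated with a small NumPy sketch. This is not the TADet implementation (see the repository linked above for that); it is a minimal, parameter-free illustration under stated assumptions: the feature map has shape (C, H, W), each branch collapses one axis by mean and max pooling to obtain a two-dimensional attention map, and the three gated branches are simply averaged. The function names `plane_attention` and `triple_attention_sketch`, the sigmoid gate, and the averaging step are all hypothetical choices made for illustration; an actual module would most likely use learned convolutions in place of the fixed gate.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def plane_attention(feat, axis):
    """Gate `feat` with a 2-D attention map obtained by pooling over `axis`.

    Hypothetical, parameter-free stand-in for a learned attention branch.
    """
    # Pool along the collapsed axis; keepdims lets the map broadcast back.
    avg = feat.mean(axis=axis, keepdims=True)
    mx = feat.max(axis=axis, keepdims=True)
    # Assumption: a fixed sigmoid gate instead of learned convolutions.
    gate = sigmoid(avg + mx)
    return feat * gate


def triple_attention_sketch(feat):
    """feat has shape (C, H, W); returns the mean of three gated branches."""
    cw = plane_attention(feat, axis=1)  # collapse H: channel-width plane
    ch = plane_attention(feat, axis=2)  # collapse W: channel-height plane
    wh = plane_attention(feat, axis=0)  # collapse C: width-height plane
    # Assumption: branches are integrated by a plain average.
    return (cw + ch + wh) / 3.0


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))  # toy feature map: C=8, H=16, W=16
y = triple_attention_sketch(x)
print(y.shape)  # (8, 16, 16)
```

Because every gate value lies between 0 and 1, the output keeps the input shape and never exceeds the input in magnitude; each branch only re-weights the features according to the plane it looks at.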