Attention-guided erasing for enhanced transfer learning in breast abnormality classification

Int J Comput Assist Radiol Surg. 2025 Jan 15. doi: 10.1007/s11548-024-03317-6. Online ahead of print.

Abstract

Purpose: Breast cancer remains one of the most prevalent cancers globally, necessitating effective early screening and diagnosis. This study investigates the effectiveness and generalizability of our recently proposed data augmentation technique, attention-guided erasing (AGE), across several transfer learning tasks for breast abnormality classification in mammography.

Methods: AGE uses attention head visualizations from DINO self-supervised pretraining to weakly localize regions of interest (ROIs) in images. These localizations are then used to stochastically erase non-essential background information from training images during transfer learning. We evaluate AGE across two image-level and three patch-level classification tasks. The image-level tasks involve breast density categorization in digital mammography (DM) and malignancy classification in contrast-enhanced mammography (CEM). The patch-level tasks include classifying calcifications and masses in scanned film mammography (SFM), as well as malignancy classification of ROIs in CEM.
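
The erasing step described in Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `attention_guided_erase`, the quantile threshold `keep_quantile`, the per-image probability `erase_prob`, and the constant `fill_value` are all assumptions made for illustration. The attention map is assumed to have been extracted already from a DINO-pretrained model as a patch-grid array.

```python
import numpy as np


def attention_guided_erase(image, attention, keep_quantile=0.6,
                           erase_prob=0.5, fill_value=0.0, rng=None):
    """Illustrative sketch only (hypothetical parameters, not the published method).

    image:     array of shape (H, W) or (H, W, C)
    attention: array of shape (h, w) on the patch grid, with H % h == 0 and W % w == 0
    """
    rng = np.random.default_rng() if rng is None else rng
    H, W = image.shape[:2]
    h, w = attention.shape
    if H % h or W % w:
        raise ValueError("image size must be a multiple of the attention grid size")

    # Stochastic step: with probability 1 - erase_prob, leave the image unchanged.
    if rng.random() >= erase_prob:
        return image.copy()

    # Weak localization (assumed rule): patches at or above the quantile
    # threshold are treated as ROI, everything else as background.
    threshold = np.quantile(attention, keep_quantile)
    roi_patches = attention >= threshold

    # Upsample the patch-level mask to pixel resolution by block repetition.
    roi_mask = np.repeat(np.repeat(roi_patches, H // h, axis=0), W // w, axis=1)

    # Erase the background by overwriting it with a constant value.
    erased = image.copy()
    erased[~roi_mask] = fill_value
    return erased
```

In a training pipeline, a function like this would be applied per image before the usual augmentations, with the attention map computed once per training image. How the threshold and erase probability should be chosen is not stated in this abstract and would need to be tuned per dataset and task.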

Results: AGE yields statistically significant improvements in mean F1-scores over baselines on four of the five tasks. Specifically, for image-level classification of breast density in DM and malignancy in CEM, we achieve gains of 2% and 1.5%, respectively. For patch-level classification of calcifications in SFM and of ROIs in CEM, gains of 0.4% and 0.6% are observed, respectively. However, only marginal improvement is observed in the mass classification task, indicating that further optimization is needed for tasks where erasing may obscure critical features.

Conclusion: Our findings underscore the potential of AGE, a dataset- and task-specific augmentation strategy powered by self-supervised learning, to enhance the downstream classification performance of deep learning (DL) models, particularly vision transformers (ViTs), in medical imaging.

Keywords: Breast cancer; Data augmentation; Mammography; Self-supervised learning; Transfer learning.