Quantifying interpretation reproducibility in Vision Transformer models with TAVAC

Sci Adv. 2024 Dec 20;10(51):eabg0264. doi: 10.1126/sciadv.abg0264. Epub 2024 Dec 20.

Abstract

Deep learning algorithms can extract meaningful diagnostic features from biomedical images, promising improved patient care in digital pathology. Vision Transformer (ViT) models capture long-range spatial relationships and offer more robust predictive power and better interpretability for image classification tasks than convolutional neural network models. However, limited annotated biomedical imaging datasets can cause ViT models to overfit, producing false predictions driven by random noise. To address this, we introduce Training Attention and Validation Attention Consistency (TAVAC), a metric for evaluating ViT model overfitting and quantifying interpretation reproducibility. TAVAC compares high-attention regions between the training and testing phases; we evaluated it on four public image classification datasets and two independent breast cancer histological image datasets. Overfitted models showed significantly lower TAVAC scores. TAVAC also distinguishes off-target from on-target attention and measures how well interpretations generalize at a fine-grained cellular level. Beyond diagnostics, TAVAC strengthens interpretive reproducibility in basic research, revealing critical spatial patterns and cellular structures in biomedical images and extending to general nonbiomedical images.
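The abstract characterizes TAVAC as a comparison of high-attention regions between training and testing but does not reproduce the formula in this record. The sketch below shows one way such a consistency score could be computed, assuming per-patch attention maps are extracted for the same image from a run in which it was in the training split and a run in which it was in the validation split; the function name attention_consistency, the top_fraction parameter, and the Jaccard-overlap choice are illustrative assumptions, not the paper's definition.

```python
# Minimal sketch of a TAVAC-style consistency score (illustrative only;
# the paper's exact definition is not reproduced in this record).
import numpy as np

def attention_consistency(train_attn: np.ndarray,
                          val_attn: np.ndarray,
                          top_fraction: float = 0.1) -> float:
    """Overlap of high-attention patches between two attention maps.

    train_attn, val_attn: per-patch attention weights for the same image,
    e.g. from a run where the image sat in the training split and from a
    run where it sat in the validation split.
    """
    assert train_attn.shape == val_attn.shape
    k = max(1, int(top_fraction * train_attn.size))
    # Indices of the k most-attended patches in each map.
    top_train = set(np.argsort(train_attn.ravel())[-k:].tolist())
    top_val = set(np.argsort(val_attn.ravel())[-k:].tolist())
    # Jaccard overlap: 1.0 means identical high-attention regions;
    # chance-level overlap suggests attention driven by noise (overfitting).
    return len(top_train & top_val) / len(top_train | top_val)

# Example: identical maps score 1.0, as expected for a model whose
# interpretation reproduces across phases; independent random maps
# score near chance level, as expected under overfitting.
rng = np.random.default_rng(0)
shared = rng.random((14, 14))  # 14x14 patch grid, as in ViT-Base/16
print(attention_consistency(shared, shared))                              # 1.0
print(attention_consistency(rng.random((14, 14)), rng.random((14, 14))))  # low
```

Under this reading, the score is high when the model attends to the same image regions regardless of which split the image was in, and low when high-attention regions shift arbitrarily, which is the behavior the abstract associates with overfitting.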

MeSH terms

  • Algorithms
  • Breast Neoplasms / diagnosis
  • Breast Neoplasms / diagnostic imaging
  • Breast Neoplasms / pathology
  • Deep Learning*
  • Female
  • Humans
  • Image Interpretation, Computer-Assisted / methods
  • Image Processing, Computer-Assisted / methods
  • Neural Networks, Computer
  • Reproducibility of Results