Regularizing Attention Scores with Bootstrapping
New statistical method filters noise from Vision Transformer attention maps, creating sparse, meaningful explanations.
A new research paper by Neo Christopher Chung and Maxim Laletin tackles a core problem in Vision Transformer (ViT) interpretability. ViTs use attention mechanisms to weigh the importance of different image patches, and these 'attention scores' are often used to explain the model's decisions. However, the scores are almost always non-zero, producing noisy, diffuse attention maps that are difficult to interpret. The authors propose a statistical solution called 'Attention Regularization,' which treats attention scores within a framework in which independent noise yields values that are non-zero but statistically insignificant.
The technique leverages bootstrapping, a classic statistical method, to generate a baseline distribution of attention scores by repeatedly resampling the input features. This distribution is then used to estimate the statistical significance and posterior probabilities of each attention score. The result is a straightforward removal of spurious attention arising from noise, leading to drastically improved shrinkage and sparsity in the attention maps. The authors demonstrate the method's effectiveness on both natural and medical image datasets, showing it can filter out meaningless signals to reveal the truly important features a ViT is using.
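The procedure described above can be sketched in a few lines of NumPy. This is a hedged illustration, not the authors' released code: the function names (`attention_scores`, `bootstrap_filter`), the choice to resample key patches with replacement, and the significance threshold `alpha` are all assumptions made for the sake of a minimal, self-contained example.

```python
# Hypothetical sketch of bootstrap-based attention filtering.
# Not the paper's implementation; names and details are assumed.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_scores(q, k):
    # Standard scaled dot-product attention over patch embeddings.
    return softmax(q @ k.T / np.sqrt(k.shape[-1]), axis=-1)

def bootstrap_filter(q, k, n_boot=1000, alpha=0.05, seed=0):
    """Zero out attention scores that a bootstrap null distribution,
    built by resampling the key patches with replacement, cannot
    distinguish from noise."""
    rng = np.random.default_rng(seed)
    observed = attention_scores(q, k)            # (n_patches, n_patches)
    null = np.empty((n_boot,) + observed.shape)
    for b in range(n_boot):
        idx = rng.integers(0, k.shape[0], k.shape[0])  # resample patches
        null[b] = attention_scores(q, k[idx])
    # Empirical p-value: fraction of bootstrap scores >= observed score.
    pvals = (null >= observed).mean(axis=0)
    # Keep only significant scores -> sparse attention map.
    sparse = np.where(pvals < alpha, observed, 0.0)
    return sparse, pvals

# Usage: random embeddings stand in for ViT patch features.
rng = np.random.default_rng(1)
q = rng.normal(size=(8, 16))
k = rng.normal(size=(8, 16))
sparse, pvals = bootstrap_filter(q, k, n_boot=200)
```

The resulting `sparse` map preserves only the attention entries that exceed what resampling-induced noise would produce, which is the shrinkage-and-sparsity effect the paper reports.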
This work, accepted at the Artificial Intelligence and Statistics (AISTATS) 2026 conference, highlights bootstrapping as a practical, statistically grounded tool for regularizing attention scores. By quantifying the uncertainty in these explanations, the approach moves beyond visual inspection toward more rigorous, trustworthy interpretability for complex vision models. The code is publicly available, allowing other researchers and practitioners to apply this regularization technique in their own work.
- Uses statistical bootstrapping to resample ViT inputs and create a baseline distribution for attention scores.
- Filters out noisy, non-zero scores to produce sparse, interpretable attention maps with quantifiable uncertainty.
- Demonstrated efficacy on medical and natural images, accepted for publication at AISTATS 2026.
Why It Matters
Enables more trustworthy AI explanations for critical fields like medical imaging by separating signal from noise in model decisions.