Useful nonrobust features are ubiquitous in biomedical images
Deep networks exploit patterns invisible to humans that boost in-distribution accuracy but fail under distribution shifts.
A new study accepted at the IEEE International Symposium on Biomedical Imaging (ISBI) 2026 reveals that deep networks for medical imaging learn 'useful nonrobust features': input patterns that are predictive but not interpretable by humans and highly susceptible to small adversarial perturbations. The researchers, led by Coenraad Mouton, tested this across five MedMNIST classification tasks and found that models trained exclusively on these nonrobust features still achieved accuracy well above chance, confirming their predictive value in standard (in-distribution) settings.
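To make the idea concrete: one standard way to isolate nonrobust features, in the spirit of earlier work on the topic, is to perturb each training image toward a randomly chosen target class with a small, imperceptible attack and then relabel it with that target, so that a model trained on the result can only succeed by picking up nonrobust signal. The sketch below illustrates this construction in PyTorch; the function names, hyperparameters, and the `model`/`loader` objects are assumptions for illustration, and the paper's exact procedure may differ.

```python
import torch
import torch.nn.functional as F

def pgd_targeted(model, x, y_target, eps=8 / 255, alpha=2 / 255, steps=20):
    """Targeted PGD attack: nudge x toward class y_target within an
    L-infinity ball of radius eps, so the visible content is unchanged."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y_target)
        (grad,) = torch.autograd.grad(loss, x_adv)
        # Targeted attack: step *down* the loss toward the target label.
        x_adv = (x_adv - alpha * grad.sign()).detach()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project back into the ball
        x_adv = x_adv.clamp(0.0, 1.0)             # keep valid pixel range
    return x_adv

def build_nonrobust_dataset(model, loader, num_classes):
    """Relabel perturbed images with the attack's target class. A model
    trained on (x_adv, y_target) pairs can only reach above-chance test
    accuracy by exploiting nonrobust features, since the human-visible
    content still shows the original class."""
    model.eval()
    xs, ys = [], []
    for x, y in loader:
        # Random target class guaranteed to differ from the true label.
        y_target = (y + torch.randint(1, num_classes, y.shape)) % num_classes
        xs.append(pgd_targeted(model, x, y_target))
        ys.append(y_target)
    return torch.cat(xs), torch.cat(ys)
```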
However, the study also uncovers a critical trade-off. Models trained adversarially to rely on robust (interpretable) features sacrificed in-distribution accuracy but performed significantly better under controlled distribution shifts (tested via MedMNIST-C). This suggests that while nonrobust features boost standard test performance, they degrade out-of-distribution reliability. The authors argue that the robustness-accuracy balance should be tailored to the specific clinical deployment, especially where data shifts are common.
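Robust-feature models in this line of work are typically obtained with adversarial training, i.e. fitting the network on worst-case perturbations of each batch. Below is a minimal sketch of the common PGD-based recipe in the same assumed PyTorch setup as above; it is an illustrative baseline, not necessarily the training configuration used in the paper.

```python
import torch
import torch.nn.functional as F

def pgd_untargeted(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Untargeted PGD: find a worst-case perturbation of x inside the
    eps-ball that maximizes the loss on the true label y."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        (grad,) = torch.autograd.grad(loss, x_adv)
        x_adv = (x_adv + alpha * grad.sign()).detach()  # ascend the loss
        x_adv = x + (x_adv - x).clamp(-eps, eps)        # project to eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv

def adversarial_training_epoch(model, loader, optimizer):
    """One epoch of adversarial training: fit worst-case examples so the
    network is pushed toward robust features, trading some clean accuracy
    for stability under perturbations and distribution shifts."""
    model.train()
    for x, y in loader:
        x_adv = pgd_untargeted(model, x, y)
        optimizer.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        optimizer.step()
```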
- Models trained only on nonrobust features achieve above-chance accuracy across five MedMNIST tasks.
- Adversarially trained models relying on robust features sacrifice in-distribution accuracy but improve out-of-distribution performance on MedMNIST-C.
- The study highlights a practical robustness-accuracy trade-off for medical imaging AI deployment.
Why It Matters
This trade-off means clinical AI must balance accuracy against reliability under real-world data shifts.