Triggering hallucinations in model-based MRI reconstruction via adversarial perturbations
Small, noise-like perturbations can trick state-of-the-art medical imaging AI into creating false features.
A new research paper from Ghent University and KU Leuven exposes a critical vulnerability in AI-powered medical imaging. The study, 'Triggering hallucinations in model-based MRI reconstruction via adversarial perturbations,' demonstrates that state-of-the-art generative models used to reconstruct MRI scans can be easily fooled into creating false anatomical features.
The team, led by Suna Buğday, Yvan Saeys, and Jonathan Peck, tested UNet and end-to-end VarNet architectures on the fastMRI dataset (brain and knee images). They crafted tiny, noise-like adversarial perturbations and applied them to the raw, unprocessed input data. These imperceptible changes reliably caused the models to 'hallucinate', inserting image features that were not present in the original measurements. This fragility suggests such hallucinations may be a fundamental weakness of current model architectures rather than rare, isolated errors.
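The attack follows the familiar projected-gradient recipe from the adversarial-examples literature. Below is a minimal, hypothetical PyTorch sketch of that general idea, not the authors' exact procedure: it assumes a reconstruction network `recon_model` that maps raw measurements (stored as a real-valued tensor, e.g. stacked real/imaginary channels) to an image, and all names and hyperparameters are illustrative.

```python
import torch

def craft_perturbation(recon_model, kspace, clean_recon, eps=1e-3, steps=40, step_size=2e-4):
    """PGD-style search for a small additive perturbation of the raw input
    that maximally changes the reconstructed image (untargeted attack)."""
    delta = torch.zeros_like(kspace, requires_grad=True)
    for _ in range(steps):
        recon = recon_model(kspace + delta)
        # Ascend on the distance between perturbed and clean reconstructions.
        loss = torch.nn.functional.mse_loss(recon, clean_recon)
        loss.backward()  # (a real attack would also freeze the model's parameter gradients)
        with torch.no_grad():
            delta += step_size * delta.grad.sign()   # gradient ascent step
            delta.clamp_(-eps, eps)                  # keep the perturbation noise-like and tiny
        delta.grad.zero_()
    return delta.detach()
```

Adding the returned `delta` to the measurements and re-running the model yields the distorted reconstruction; because the perturbation is clipped to a tiny budget, it remains visually indistinguishable from ordinary acquisition noise.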
The context makes this alarming: AI is increasingly deployed to accelerate MRI scans, reducing patient scan times from minutes to seconds, but that speedup relies on the model filling in missing data. The research shows this process is not robust. Crucially, the induced hallucinations could not be reliably flagged by traditional image quality metrics such as PSNR or SSIM, so a corrupted scan could pass automated quality checks and still appear perfectly normal to a radiologist.
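The metric blind spot is easy to see in a toy experiment: PSNR and SSIM average error over the whole image, so a small, localized false feature barely moves either score. The snippet below uses placeholder arrays rather than real fastMRI data and is purely illustrative.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def quality_report(reference: np.ndarray, reconstruction: np.ndarray) -> dict:
    data_range = reference.max() - reference.min()
    return {
        "psnr": peak_signal_noise_ratio(reference, reconstruction, data_range=data_range),
        "ssim": structural_similarity(reference, reconstruction, data_range=data_range),
    }

# A 320x320 image with a tiny 6x6 "hallucinated" bright blob: the global
# scores stay high even though a reader could interpret the blob as pathology.
reference = np.random.rand(320, 320).astype(np.float32)
hallucinated = reference.copy()
hallucinated[150:156, 150:156] += 0.5
print(quality_report(reference, hallucinated))
```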
The practical implication is stark: without new safeguards, AI-reconstructed medical images could lead to misdiagnosis. The authors suggest that adversarial training, which exposes models to such attacks during development, may help, but novel detection methods are urgently needed. This work forces a reevaluation of how generative AI is validated for high-stakes clinical use, moving beyond benchmark accuracy to testing robustness against worst-case perturbations of the input data.
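For readers curious what adversarial training looks like in practice, here is a hedged sketch of a single training step. The names are hypothetical and the inner attack is a simple one-step (FGSM-style) perturbation, not necessarily the procedure the authors would recommend.

```python
import torch

def adversarial_training_step(model, optimizer, kspace, target_image, eps=1e-3):
    # Inner step: craft a single-step perturbation of the measurements.
    delta = torch.zeros_like(kspace, requires_grad=True)
    attack_loss = torch.nn.functional.mse_loss(model(kspace + delta), target_image)
    attack_loss.backward()
    delta = (eps * delta.grad.sign()).detach()

    # Outer step: fit the model on the perturbed input so it learns to
    # reconstruct correctly despite the attack.
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(kspace + delta), target_image)
    loss.backward()
    optimizer.step()
    return loss.item()
```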
- AI models (UNet/VarNet) for MRI reconstruction hallucinate false features when presented with subtly perturbed input data.
- The adversarial perturbations are small and noise-like, making them difficult to detect prior to reconstruction.
- Standard image quality metrics fail to identify these hallucinations, posing a direct risk to diagnostic accuracy.
Why It Matters
This vulnerability could lead to AI-assisted misdiagnosis in clinical settings, demanding new robustness standards for medical AI.