Are Natural-Domain Foundation Models Effective for Accelerated Cardiac MRI Reconstruction?
CLIP and DINOv2 show surprising resilience in cross-domain MRI reconstruction...
A team from Dublin City University and University College Dublin investigated whether large-scale pretrained foundation models from the natural-image domain can serve as effective priors for accelerated cardiac MRI reconstruction, a challenging physics-based inverse problem. They propose an unrolled reconstruction framework that incorporates frozen visual encoders (specifically CLIP, DINOv2, and BiomedCLIP) within each cascade to guide the reconstruction process. The approach draws on the feature representations these models learned during large-scale pretraining, on the hypothesis that such priors generalize beyond their original domain.
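To make the idea of an unrolled cascade with a frozen encoder more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the single-coil Cartesian setting, the function names (`frozen_encoder`, `refine`, `unrolled_recon`), the fixed linear map standing in for CLIP or DINOv2, and the way features feed a correction term are all illustrative assumptions. The only parts taken from the summary are the overall shape (repeated cascades, each consulting an encoder whose weights never change) and the physics-based data-consistency step that any such inverse-problem solver needs.

```python
import numpy as np

def fft2c(image):
    """Centered, orthonormal 2D FFT (image -> k-space)."""
    return np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(image), norm="ortho"))

def ifft2c(kspace):
    """Centered, orthonormal 2D inverse FFT (k-space -> image)."""
    return np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(kspace), norm="ortho"))

def frozen_encoder(image, enc_weights):
    """Hypothetical stand-in for a frozen pretrained encoder.

    A fixed linear map plus tanh on the magnitude image; enc_weights
    is never updated, mirroring the 'frozen' property.
    """
    return np.tanh(enc_weights @ np.abs(image).ravel())

def refine(image, features, dec_weights, scale):
    """Hypothetical trainable head: turns features into an image correction."""
    correction = (dec_weights @ features).reshape(image.shape)
    return image + scale * correction

def unrolled_recon(kspace, mask, enc_weights, dec_weights,
                   num_cascades=4, scale=0.1):
    """Run a fixed number of cascades: prior-guided refinement, then data consistency."""
    measured = mask * kspace
    x = ifft2c(measured)  # zero-filled starting point
    for _ in range(num_cascades):
        # Prior step: the frozen encoder's features guide a correction.
        features = frozen_encoder(x, enc_weights)
        x = refine(x, features, dec_weights, scale)
        # Data-consistency step: a unit gradient step on
        # 0.5 * ||mask * F(x) - measured||^2, which restores the
        # measured k-space samples exactly for a binary mask.
        residual = mask * fft2c(x) - measured
        x = x - ifft2c(residual)
    return x
```

In a real system the encoder would be a pretrained network loaded with its weights frozen, and only the refinement parameters would be trained end to end through the cascades.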
Through extensive experiments, the researchers found that while task-specific state-of-the-art models like E2E-VarNet achieve superior performance in standard in-distribution settings, foundation-model-based approaches remain competitive. More importantly, in challenging cross-domain scenarios—where models are trained on cardiac MRI and evaluated on anatomically distinct knee and brain datasets—foundation models exhibit improved robustness, particularly under high acceleration factors and limited low-frequency sampling. Natural-image-pretrained models like CLIP learned highly transferable structural representations, while domain-specific pretraining (BiomedCLIP) provided modest additional gains in more ill-posed regimes. These results suggest that pretrained foundation models offer a promising source of transferable priors for improving robustness and generalization in accelerated MRI reconstruction. The work was accepted to CVPRW 2026.
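The two stress conditions mentioned above, high acceleration and limited low-frequency sampling, can be illustrated with a simple Cartesian undersampling mask. The sketch below is an assumption about how such masks are commonly built (a fully sampled central band plus randomly chosen outer columns); the summary does not state the paper's actual sampling scheme, and the function name `make_column_mask` and its parameters are hypothetical.

```python
import numpy as np

def make_column_mask(num_cols, acceleration, center_fraction, seed=0):
    """Build a 1D boolean mask over k-space columns.

    acceleration:    keep roughly num_cols / acceleration columns in total.
    center_fraction: fraction of columns kept as a contiguous
                     low-frequency band around the k-space center.
    """
    rng = np.random.default_rng(seed)
    num_low = int(round(num_cols * center_fraction))
    target = int(round(num_cols / acceleration))
    if num_low > target:
        raise ValueError("center band exceeds the sampling budget")

    mask = np.zeros(num_cols, dtype=bool)
    start = (num_cols - num_low) // 2
    mask[start:start + num_low] = True  # fully sampled low frequencies

    # Spend the remaining budget on random high-frequency columns.
    outer = np.flatnonzero(~mask)
    extra = rng.choice(outer, size=target - num_low, replace=False)
    mask[extra] = True
    return mask
```

Raising `acceleration` shrinks the total number of sampled columns, and lowering `center_fraction` shrinks the low-frequency band, which is the more ill-posed regime the summary refers to. For example, `make_column_mask(320, 8, 0.04)` keeps 40 of 320 columns, 13 of them in the central band; these particular values are illustrative and not taken from the paper.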
- Unrolled framework uses frozen CLIP, DINOv2, and BiomedCLIP encoders as priors for cardiac MRI reconstruction.
- Foundation models are more robust than task-specific models in cross-domain evaluation (cardiac→knee/brain), particularly under high acceleration factors and limited low-frequency sampling.
- CLIP learned highly transferable structural representations; BiomedCLIP provided modest gains in ill-posed regimes.
Why It Matters
Foundation models could make MRI reconstruction more robust across anatomies, reducing retraining needs in clinical settings.