MIRAGE model reconstructs mental imagery from fMRI with SOTA accuracy

arXiv q-bio.NC May 19, 2026

⚡Researchers decode your mind's eye using brain scans and diffusion models.

Deep Dive

In an analysis of the NSD-Imagery dataset, researchers found that some modern vision decoders reconstruct mental images well while others fail, and that top performance on seen images does not guarantee success on mental imagery. To address this, they developed MIRAGE, presented in the paper "Robust multi-modal architectures translate fMRI-to-image models from vision to mental imagery." MIRAGE combines a linear backbone with multi-modal text and image features, which serve as input to a diffusion model. The architecture explicitly targets cross-decoding: a decoder trained on brain responses to viewed images is applied to internally generated visual content.

Both feature metrics and human raters rank MIRAGE as state-of-the-art on the NSD-Imagery benchmark. Ablation analysis shows that mental image reconstruction works best when decoders use relatively low-dimensional image features and combine guidance from text descriptions with both high- and low-level image features. The work demonstrates that, given the right architecture, existing large-scale datasets collected with external visual stimuli can serve as effective training data for decoding mental images. This could support applications in brain-computer interfaces and neuroscientific research.
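To make the cross-decoding idea concrete, here is a minimal sketch on synthetic data. It is not the authors' implementation. Ridge regression stands in for the "linear backbone", and the feature sizes, noise levels, and all function names are assumptions chosen for illustration. The diffusion model is omitted; the sketch stops at the conditioning vectors such a model would receive.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy sizes, chosen arbitrarily for this sketch (not taken from the paper).
N_VOXELS, IMG_DIM, TXT_DIM = 200, 16, 8

def simulate(n_trials, encoding, noise_std):
    """Toy fMRI: voxel responses are a noisy linear function of stimulus features."""
    feats = rng.standard_normal((n_trials, encoding.shape[0]))
    noise = noise_std * rng.standard_normal((n_trials, encoding.shape[1]))
    return feats @ encoding + noise, feats

def fit_ridge(voxels, targets, alpha=10.0):
    """Closed-form ridge regression from voxels to feature targets."""
    gram = voxels.T @ voxels + alpha * np.eye(voxels.shape[1])
    return np.linalg.solve(gram, voxels.T @ targets)

def mean_row_correlation(pred, true):
    """Average Pearson correlation between predicted and true feature vectors."""
    p = pred - pred.mean(axis=1, keepdims=True)
    t = true - true.mean(axis=1, keepdims=True)
    denom = np.linalg.norm(p, axis=1) * np.linalg.norm(t, axis=1)
    return float(np.mean((p * t).sum(axis=1) / denom))

# Shared (hypothetical) mapping from image+text features to voxel responses.
encoding = rng.standard_normal((IMG_DIM + TXT_DIM, N_VOXELS))

# Assumption: imagery trials are fewer and noisier than vision trials.
vision_voxels, vision_feats = simulate(1000, encoding, noise_std=1.0)
imagery_voxels, imagery_feats = simulate(100, encoding, noise_std=3.0)

# Train only on vision data, then apply the same weights to imagery data.
weights = fit_ridge(vision_voxels, vision_feats)
conditioning = imagery_voxels @ weights

# Split into the image and text guidance a diffusion model would consume.
image_cond, text_cond = conditioning[:, :IMG_DIM], conditioning[:, IMG_DIM:]
score = mean_row_correlation(conditioning, imagery_feats)
print(f"imagery feature correlation: {score:.2f}")
```

The point of the sketch is the train/test split: the decoder never sees imagery trials during fitting, which mirrors the claim that vision datasets can serve as training data for mental imagery decoding.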

Key Points

MIRAGE uses a linear backbone and multi-modal text+image features fed into a diffusion model to reconstruct mental imagery from fMRI.
Achieves state-of-the-art performance on the NSD-Imagery benchmark, surpassing other vision decoders.
Ablation shows best results with low-dimensional image features and combined text and multi-level image guidance.

Why It Matters

Decoding internal mental imagery from brain scans enables new brain-computer interfaces and deeper understanding of visual imagination.
