Seeing the imagined: a latent functional alignment in visual imagery decoding from fMRI data
A new method maps fMRI activity recorded during mental imagery into the conditioning space of a pretrained perception decoder, enabling above-chance decoding of imagined content.
A research team led by Fabrizio Spera has published a paper on arXiv titled 'Seeing the imagined: a latent functional alignment in visual imagery decoding from fMRI data.' The work addresses a known gap in neuroAI: models like DynaDiff reconstruct perceived images well from fMRI data (brain activity recorded while a person views pictures), but they perform poorly on mental imagery, where the goal is to reconstruct what a person imagines without the image in front of them. The team's key contribution is a 'latent functional alignment' technique that works as an adapter. Instead of retraining the entire DynaDiff model from scratch, the method learns to map the brain activity patterns of imagination into the same 'conditioning space' the model was trained on for perception, while the core generative components stay frozen.
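The adapter idea can be illustrated with a toy example. The sketch below is not the paper's implementation: it assumes a purely linear adapter trained by stochastic gradient descent on synthetic three-dimensional vectors, and names such as `fit_linear_adapter`, `perception_latents`, and `imagery_latents` are hypothetical. The frozen generative model is not simulated; only the small map into its conditioning space is learned.

```python
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def mse(preds, targets):
    """Mean squared error over every entry of every vector."""
    total = sum((p - t) ** 2
                for pv, tv in zip(preds, targets)
                for p, t in zip(pv, tv))
    return total / (len(targets) * len(targets[0]))

def fit_linear_adapter(sources, targets, lr=0.1, epochs=200, weight_decay=1e-4):
    """Fit W so that W @ source approximates target (SGD, squared error + L2)."""
    d_in, d_out = len(sources[0]), len(targets[0])
    W = [[0.0] * d_in for _ in range(d_out)]
    for _ in range(epochs):
        for x, y in zip(sources, targets):
            pred = matvec(W, x)
            for i in range(d_out):
                err = pred[i] - y[i]
                for j in range(d_in):
                    W[i][j] -= lr * (err * x[j] + weight_decay * W[i][j])
    return W

rng = random.Random(0)

# Assumption: conditioning vectors the frozen decoder expects (synthetic here).
perception_latents = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(40)]

# Assumption: imagery carries the same content, but systematically distorted.
imagery_latents = [[0.9 * z[1], -0.8 * z[0], 0.7 * z[2]] for z in perception_latents]

# Train the adapter on 30 paired trials, evaluate on the 10 held-out trials.
train_x, test_x = imagery_latents[:30], imagery_latents[30:]
train_y, test_y = perception_latents[:30], perception_latents[30:]

W = fit_linear_adapter(train_x, train_y)

mse_before = mse(test_x, test_y)                           # no adaptation
mse_after = mse([matvec(W, x) for x in test_x], test_y)    # after alignment

print("held-out MSE without adapter:", mse_before)
print("held-out MSE with adapter:   ", mse_after)
```

Because only `W` is updated, whatever consumes the conditioning vectors can stay untouched, which is the practical appeal of aligning in latent space rather than retraining the full model.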
To overcome the scarcity of matched 'imagery-perception' data (where a person both sees and imagines the same thing), the researchers introduced a retrieval-based augmentation strategy, which finds semantically similar perception trials in the large Natural Scenes Dataset (NSD) and adds them to the training set. Tested on the Imagery-NSD benchmark across four subjects, the alignment method consistently outperformed both a frozen baseline and a voxel-space ridge regression approach. It also achieved above-chance decoding from multiple cortical regions, which suggests that it uses the semantic structure the model learned from perception to stabilize reconstructions of imagined content. This is a step toward more general brain-computer interfaces that can interpret internally generated content, not just sensory input.
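The retrieval step can be sketched as a nearest-neighbor search. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes every stimulus already has a semantic embedding vector (the summary does not say which encoder produces it), uses cosine similarity as the similarity measure, and the trial IDs, vectors, and the helper `retrieve_similar` are invented for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def retrieve_similar(query, bank, k=2):
    """Return the IDs of the k bank entries most similar to the query."""
    ranked = sorted(bank.items(),
                    key=lambda item: cosine(query, item[1]),
                    reverse=True)
    return [trial_id for trial_id, _ in ranked[:k]]

# Hypothetical semantic embeddings of perception trials (toy 3-D vectors).
perception_bank = {
    "nsd_trial_dog": [0.9, 0.1, 0.0],
    "nsd_trial_cat": [0.8, 0.3, 0.1],
    "nsd_trial_car": [0.0, 0.2, 0.9],
    "nsd_trial_bus": [0.1, 0.1, 0.95],
}

# Hypothetical embedding of the stimulus a subject was asked to imagine.
imagery_query = [1.0, 0.2, 0.0]

# Perception trials to add as extra training pairs for this imagery trial.
neighbors = retrieve_similar(imagery_query, perception_bank, k=2)
print(neighbors)  # ['nsd_trial_dog', 'nsd_trial_cat']
```

In a setup like this, the brain responses recorded for the retrieved perception trials would serve as additional, approximate supervision for the imagery trial. The value of `k` trades off more training data against looser semantic matches.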
- The method adapts the state-of-the-art 'DynaDiff' visual perception decoder to mental imagery using a 'latent functional alignment' adapter, keeping the core generative components frozen.
- It uses a retrieval-based augmentation strategy to find related perception data, mitigating the limited supervised imagery data available for training.
- Tested on four subjects in the Imagery-NSD benchmark, it improved semantic metrics and enabled above-chance decoding from multiple brain regions.
Why It Matters
The work extends fMRI decoding beyond perception to imagined content, a step toward brain-computer interfaces that can interpret internally generated imagery and not only sensory input.