Interpreting V1 Population Activity via Image-Neural Latent Representation Alignment
A new AI framework reveals that the brain's visual cortex relies on low-level structure, not semantics.
Understanding how the primary visual cortex (V1) processes visual information has been a major challenge in neuroscience. While recent alignment-based methods have improved decoding of visual stimuli from brain activity, they offer limited insight into the underlying neural computations. To address this, researchers from multiple institutions introduce DINA (Dual-Tower Image-Neural Alignment), an interpretable contrastive framework whose biologically motivated image and neural towers are trained jointly. DINA aligns images and population-level V1 responses in a shared latent space at the level of intermediate feature maps, enabling both accurate decoding and direct access to interpretable feature maps.
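The summary does not spell out DINA's training objective. As a minimal, hypothetical sketch, a CLIP-style symmetric contrastive (InfoNCE) loss between the two towers' latent outputs could look like the following; all dimensions, variable names, and the temperature value are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: B matched image/response pairs, D-dim shared latent space.
B, D = 8, 16

def l2_normalize(x, axis=-1):
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

# Stand-ins for the two towers' outputs: an image encoder and a neural
# (V1 population response) encoder, each mapping its input into R^D.
image_latents = l2_normalize(rng.normal(size=(B, D)))
neural_latents = l2_normalize(rng.normal(size=(B, D)))

def info_nce_loss(z_img, z_neu, temperature=0.07):
    """Symmetric InfoNCE loss: matched pairs sit on the diagonal."""
    logits = z_img @ z_neu.T / temperature  # (B, B) similarity matrix
    labels = np.arange(len(logits))

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # Cross-entropy in both directions (image -> neural and neural -> image).
    return 0.5 * (xent(logits) + xent(logits.T))

loss = info_nce_loss(image_latents, neural_latents)
```

Minimizing this loss pulls each image's latent toward its own population response and pushes it away from the other responses in the batch, which is what produces a shared, directly comparable latent space for the two modalities.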
Evaluated on large-scale two-photon calcium imaging data from mouse V1, DINA achieves accurate neural-based decoding while revealing a critical insight: decoding performance is primarily supported by coarse, low-level visual structure—such as shape and texture cues—rather than by semantic category information or fine-grained details. Further analysis shows that alignable feature maps emerge from multiple spatially distributed image regions and are predominantly reconstructed by sparse subsets of strongly responsive neurons and their functional interactions. These results demonstrate that DINA provides a principled framework for probing the computational mechanisms of visual processing in V1, moving beyond decoding accuracy toward genuine interpretability.
Key Findings
- DINA uses a dual-tower contrastive architecture to align images and mouse V1 neural responses in a shared latent space.
- Decoding performance relies primarily on coarse, low-level visual structure (shape/texture) rather than high-level semantic categories.
- Feature maps are reconstructed by sparse subsets of strongly responsive neurons and their distributed interactions.
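As a toy illustration of the last point—a small, strongly responsive subset of neurons carrying most of the alignable signal—the following hypothetical simulation (all numbers invented, not from the paper's data) reconstructs a latent feature map through a linear readout using only the top 10% most responsive model neurons, then compares it with the full-population reconstruction:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: N model neurons, a D-dim latent feature-map embedding,
# and a linear readout W mapping population responses into that latent space.
N, D = 500, 64
W = rng.normal(size=(N, D)) / np.sqrt(N)

# Simulate a population response dominated by a small, strongly driven subset.
responses = rng.normal(scale=0.1, size=N)          # weak background activity
driven = rng.choice(N, size=25, replace=False)     # ~5% strongly responsive
responses[driven] += rng.normal(scale=2.0, size=driven.size)

full_recon = responses @ W

def sparse_reconstruction(frac):
    """Reconstruct using only the top `frac` fraction of neurons by |response|."""
    k = int(frac * N)
    keep = np.argsort(np.abs(responses))[-k:]
    masked = np.zeros_like(responses)
    masked[keep] = responses[keep]
    return masked @ W

# Correlation between the sparse (top 10%) and full-population reconstructions.
r = np.corrcoef(full_recon, sparse_reconstruction(0.10))[0, 1]
```

In this toy setting the sparse reconstruction correlates strongly with the full one, mirroring (but not reproducing) the paper's finding that a small set of strongly responsive neurons dominates the alignable feature maps.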
Why It Matters
Provides an interpretable framework for probing neural computations, with implications for brain-machine interfaces and biologically inspired AI vision models.