Research & Papers

New Study: Supervised Training Degrades Neural Networks' Brain-Like V1 Representations

Random, untrained networks match or beat trained models at mimicking early visual cortex.

Deep Dive

A new study by Nils Leutenegger, posted to arXiv (ID: 2605.30556), reports a striking result: untrained, randomly initialized neural networks consistently match or outperform trained networks in representational similarity to primary visual cortex (V1). Using representational similarity analysis (RSA) against human fMRI data from three subjects across six visual regions of interest (ROIs), the study tracked alignment over 8 training checkpoints (epochs 0–40) with 720 object images from the THINGS database. Four learning rules were compared: backpropagation (BP) and three alternatives usually considered more biologically plausible, namely feedback alignment (FA), predictive coding (PC), and spike-timing-dependent plasticity (STDP). Even a single epoch of supervised training reduces V1 alignment by 25–90%, with backpropagation causing the most severe degradation (Δr = -0.080). Local learning rules such as predictive coding and STDP preserve substantially more brain-like structure (Δr ≈ -0.04). A weaker opposite trend appears in the object-selective lateral occipital complex (LOC), where backpropagation shows the largest increase in alignment during training, though the absolute change remains small.
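The RSA comparison described above can be sketched in a few lines. This is a minimal illustration, not the study's pipeline: it assumes dissimilarity is 1 minus the Pearson correlation between response patterns, and that the model and brain dissimilarity matrices are compared with a Spearman rank correlation. Both are common choices, but the summary does not confirm them. All function names and the toy data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Zero-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def rdm_upper(patterns):
    """Upper triangle of the dissimilarity matrix (1 - Pearson r per stimulus pair).

    `patterns` holds one response vector per stimulus: unit activations for a
    model layer, or voxel responses for a brain region.
    """
    n = len(patterns)
    return [
        1.0 - pearson(patterns[i], patterns[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]


def rsa_score(model_patterns, brain_patterns):
    """Spearman correlation between the model and brain dissimilarities."""
    return pearson(ranks(rdm_upper(model_patterns)), ranks(rdm_upper(brain_patterns)))


# Toy data (made up): 12 stimuli, 30 model units, 20 voxels.
rng = random.Random(0)
model = [[rng.gauss(0, 1) for _ in range(30)] for _ in range(12)]
unrelated = [[rng.gauss(0, 1) for _ in range(20)] for _ in range(12)]

print(round(rsa_score(model, model), 6))  # 1.0 up to floating-point error
print(rsa_score(model, unrelated))        # some value in [-1, 1] for unrelated noise
```

Because only the pairwise dissimilarity structure is compared, the model layer and the brain region do not need the same number of units or voxels. Repeating the score at each training checkpoint gives the kind of alignment trajectory the study reports.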

These findings challenge the common assumption that supervised training makes a network's representations more brain-like. Instead, they suggest that untrained architectures already capture low-level visual statistics through inductive biases alone, that is, through the built-in structure of convolutional layers and pooling. On this reading, the global error signal used by backpropagation aggressively reshapes early representations and overrides those biases, while local learning rules such as PC and STDP better preserve the brain-like organization. For researchers building brain-like models, the implication is that local learning rules, or leaving early layers untrained, may better maintain biological fidelity. The work also raises open questions about the role of training in deeper visual areas and whether unsupervised or self-supervised methods would avoid this degradation.

Key Points
  • A single epoch of supervised training reduces V1 alignment by 25–90%, depending on the learning rule.
  • Backpropagation causes the worst degradation (Δr = -0.080), while predictive coding and STDP preserve more alignment (Δr ≈ -0.04).
  • A small opposite trend appears in object-selective cortex (LOC): backpropagation shows the largest alignment increase, but the absolute change is minimal.
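The key points mix two measures: Δr is an absolute change in the alignment correlation, while the 25–90% figure is a change relative to the untrained baseline. The sketch below shows how the two relate. The helper name and the example values are hypothetical; the summary does not state the baseline correlations, so the numbers are made up and are not the study's data.

```python
def alignment_change(r_before, r_after):
    """Return (absolute change, relative change) in an alignment score.

    r_before: alignment at the untrained checkpoint (epoch 0)
    r_after:  alignment at a later checkpoint
    """
    if r_before == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    delta_r = r_after - r_before
    return delta_r, delta_r / r_before


# Made-up example: alignment drops from 0.20 to 0.15 after one epoch.
delta_r, relative = alignment_change(0.20, 0.15)
print(f"delta r = {delta_r:.3f}, relative change = {relative:.0%}")
```

The same Δr is a larger relative loss when the baseline alignment is lower, which is one reason the two figures need not rank learning rules identically.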

Why It Matters

Challenges the common assumption that supervised learning makes networks more brain-like, and may reshape training strategies for brain-like models.