Study of 162 vision models finds universal representations that mirror the brain
Different AI vision models converge on the same conceptual features, just like primates do.
A new study from researchers at multiple institutions (Mahner et al., arXiv 2026) systematically analyzed 162 vision models, spanning different architectures, training objectives, datasets, and sizes, to determine which visual representations are universal and which are model-specific. By decomposing each model's object similarity structure into non-negative dimensions, the authors identified dimensions that reappear across almost all models. These universal dimensions are far more interpretable than model-specific ones and are driven by conceptual and semantic image properties rather than low-level features. Surprisingly, differences in architecture, objective function, training data, or model performance do not explain why some dimensions become universal.
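To make the idea of decomposing a similarity structure into non-negative dimensions more concrete, here is a minimal sketch in Python. It is not the authors' procedure: it swaps in a generic symmetric non-negative matrix factorization with damped multiplicative updates, and it scores "universality" as the average best-match correlation of a dimension across other models. The function names (`symmetric_nmf`, `column_correlations`, `universality_scores`), the scoring rule, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def symmetric_nmf(S, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative similarity matrix S (n x n) as W @ W.T, W >= 0.

    Uses damped multiplicative updates, a generic stand-in technique
    (assumption: not necessarily the decomposition used in the study).
    """
    rng = np.random.default_rng(seed)
    n = S.shape[0]
    W = rng.random((n, k)) + 0.1          # strictly positive start
    for _ in range(n_iter):
        numer = S @ W
        denom = W @ (W.T @ W) + eps       # eps avoids division by zero
        W *= 0.5 + 0.5 * numer / denom    # update keeps W non-negative
    return W


def column_correlations(A, B, eps=1e-12):
    """Pearson correlation between every column of A and every column of B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    A = A / (np.linalg.norm(A, axis=0) + eps)
    B = B / (np.linalg.norm(B, axis=0) + eps)
    return A.T @ B


def universality_scores(W, others):
    """Hypothetical universality score for each dimension (column) of W.

    For every other model, take the best absolute correlation with any of
    its dimensions, then average those best matches across models.
    """
    best = [np.abs(column_correlations(W, O)).max(axis=1) for O in others]
    return np.mean(best, axis=0)


# Synthetic illustration only: three "models" share two latent object
# dimensions and each adds one private dimension of its own.
rng = np.random.default_rng(42)
n_objects = 30
shared = rng.random((n_objects, 2))
embeddings = []
for m in range(3):
    private = rng.random((n_objects, 1))
    features = np.hstack([shared, private])
    similarity = features @ features.T    # non-negative object similarity
    embeddings.append(symmetric_nmf(similarity, k=3, seed=m))

scores = universality_scores(embeddings[0], embeddings[1:])
print(np.round(scores, 2))
```

Under this scoring rule, a dimension that has a close counterpart in the other models gets a score near 1, while a dimension with no counterpart scores lower. Because the factorization is not unique, the recovered dimensions need not line up exactly with the shared and private factors used to generate the toy data.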
The most striking finding: models that exhibit more universal dimensions also better predict neural activity in macaque inferior temporal (IT) cortex and human similarity judgments. This suggests that convergence toward universal representations is not arbitrary but reflects a deeper alignment with biological vision. The study implies that deep networks, despite their diverse training regimes, are converging on representational principles shared with primate brains. For AI developers, this suggests that models designed to yield these universal dimensions could gain in both performance and biological plausibility, potentially leading to more efficient and interpretable vision systems.
- Decomposed object similarity for 162 vision models into universal vs. model-specific dimensions
- Universal dimensions are more interpretable and driven by semantic content rather than low-level features; architecture, objective, and training data do not explain which dimensions become universal
- Models with more universal dimensions predict macaque IT activity and human similarity judgments better
Why It Matters
Suggests that AI vision models converge on brain-like representations regardless of how they are trained, which could guide development of more interpretable and biologically aligned AI.