New mental rotation model uses VR data and neuro-symbolic AI
Researchers built an AI that replicates how humans mentally rotate 3D objects
Researchers from multiple institutions, including Raymond Khazoum, Daniela Fernandes, Aleksandr Krylov, Qin Li, and Stephane Deny, have developed a deep learning model that simulates human mental rotation: the ability to compare objects seen from different viewpoints. The model, accepted at ICML 2026, stacks three components: (1) an equivariant neural encoder that builds 3D spatial representations from 2D images, (2) a neuro-symbolic object encoder that derives symbolic descriptions from those representations, and (3) a neural decision agent that compares the symbolic descriptions and, through a recurrent pathway, prescribes simulated rotations in the 3D latent space.
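To make "equivariant" concrete: an encoder is rotation-equivariant if rotating its input and then encoding gives the same result as encoding first and then rotating the output. The snippet below is a minimal sketch of that property only. `toy_encoder` and `rotation_z` are hypothetical names, and the encoder works on raw 3D points rather than 2D images, so it is a stand-in for the idea and not the architecture described above.

```python
import numpy as np

def rotation_z(theta):
    """3x3 matrix for a rotation of `theta` radians about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def toy_encoder(points):
    """Hypothetical stand-in encoder: center the points, then divide by
    their mean distance from the centroid. Both steps commute with
    rotation, which is what makes this map rotation-equivariant."""
    centered = points - points.mean(axis=0)
    scale = np.linalg.norm(centered, axis=1).mean()
    return centered / scale

rng = np.random.default_rng(0)
points = rng.normal(size=(8, 3))   # a random 8-point "object"
R = rotation_z(np.pi / 3)          # 60 degree rotation

# Row vectors are rotated by right-multiplying with R transposed.
encode_then_rotate = toy_encoder(points) @ R.T
rotate_then_encode = toy_encoder(points @ R.T)

is_equivariant = np.allclose(encode_then_rotate, rotate_then_encode)
print(is_equivariant)
```

With a representation like this, a rotation can be applied directly in the representation space instead of re-encoding a new view, which is what lets a downstream component simulate rotations.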
The team validated the model with VR experiments in which participants could manipulate objects during comparison tasks. The model reproduces human performance, response times, and behavioral patterns from both the team's own VR studies and the existing experimental literature. Ablation studies confirmed that each component is essential. The work adds to a growing collection of deep neural models of human spatial reasoning and shows the value of combining deep learning, equivariant representations, and symbolic reasoning to model cognitive processes.
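One way to see why a recurrent rotation loop can yield human-like response times is to count how many small rotation steps are needed before two objects line up. The following is a rough sketch under strong assumptions, not the authors' decision agent: `compare_by_rotation` is a hypothetical helper, objects are raw 3D point sets with known point correspondence, and rotation is restricted to fixed steps in one direction about the z axis.

```python
import numpy as np

def rotation_z(theta):
    """3x3 matrix for a rotation of `theta` radians about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def compare_by_rotation(a, b, step_deg=10.0, tol=1e-6):
    """Rotate `a` about z in fixed increments until it matches `b`.

    Returns (same, steps). `steps` is a crude proxy for response time:
    the number of simulated rotations applied before deciding.
    """
    step = rotation_z(np.deg2rad(step_deg))
    max_steps = int(round(360.0 / step_deg))
    current = a.copy()
    for steps in range(max_steps):
        if np.allclose(current, b, atol=tol):
            return True, steps
        current = current @ step.T   # simulate one more small rotation
    return False, max_steps          # full turn without a match

rng = np.random.default_rng(1)
shape = rng.normal(size=(6, 3))

# Same object seen at two different angular disparities.
same_40, steps_40 = compare_by_rotation(shape, shape @ rotation_z(np.deg2rad(40)).T)
same_80, steps_80 = compare_by_rotation(shape, shape @ rotation_z(np.deg2rad(80)).T)

# A mirrored object cannot be matched by any rotation.
mirrored = shape * np.array([-1.0, 1.0, 1.0])
same_mirror, steps_mirror = compare_by_rotation(shape, mirrored)

print(same_40, steps_40)
print(same_80, steps_80)
print(same_mirror, steps_mirror)
```

In this toy setup the step count doubles when the angle between views doubles, and a mirrored object exhausts the whole search before being rejected. A model of human behavior would at least need to choose the rotation axis and direction, which this sketch does not attempt.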
- Model uses an equivariant neural encoder to generate 3D spatial representations from 2D images
- Neuro-symbolic step extracts symbolic object descriptions for explicit comparison
- Validated against VR experiments with human participants and existing behavioral data
Why It Matters
This model bridges AI and cognitive science, advancing our understanding of human spatial reasoning and improving AI's ability to reason about 3D objects.