Exploring Human Behavior During Abstract Rule Inference and Problem Solving with the Cognitive Abstraction and Reasoning Corpus
New research reveals how humans solve abstract visual puzzles that stump today's best AI models.
A team of researchers from Boston University and UC Santa Barbara has published a study analyzing how humans tackle the same abstract reasoning problems that challenge today's most advanced AI systems. The paper, 'Exploring Human Behavior During Abstract Rule Inference and Problem Solving with the Cognitive Abstraction and Reasoning Corpus,' introduces CogARC, a human-adapted subset of the well-known Abstraction and Reasoning Corpus (ARC) benchmark. While ARC was created to test an AI system's ability to infer rules from minimal examples, this study flips the script, administering 75 of these visual puzzles to 260 human participants and recording their problem-solving behavior at high temporal resolution. The findings show that humans are remarkably flexible, achieving success rates of 80-90% across experiments and far surpassing current AI performance on the same tasks.
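To make the task format concrete: an ARC problem presents a few input/output grid pairs that demonstrate a hidden transformation rule, and the solver must apply that rule to a fresh test input. The toy sketch below illustrates the structure only; the grids, the rule, and all names are invented for illustration and do not come from the paper or the benchmark itself.

```python
# Toy sketch of an ARC-style task: demonstration pairs exhibit a hidden
# rule, and the solver applies it to a new input grid.
# Everything here is illustrative, not an actual ARC task.

def flip_horizontal(grid):
    """Candidate rule: mirror each row left-to-right."""
    return [row[::-1] for row in grid]

# Two demonstration pairs consistent with the hidden rule.
examples = [
    ([[1, 0], [2, 0]], [[0, 1], [0, 2]]),
    ([[3, 3, 0], [0, 0, 4]], [[0, 3, 3], [4, 0, 0]]),
]

# Here "inference" is reduced to checking one candidate against the
# examples; humans (and ARC solvers) must search a far larger
# hypothesis space of possible transformations.
assert all(flip_horizontal(inp) == out for inp, out in examples)

test_input = [[5, 0, 0], [0, 6, 0]]
print(flip_horizontal(test_input))  # -> [[0, 0, 5], [0, 6, 0]]
```

The minimal-examples setup is what makes ARC hard for current AI: with only a handful of demonstrations, brute-force pattern matching gives little traction and the solver must form an abstract hypothesis about the rule.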
The research provides crucial insights into the cognitive strategies behind human abstract reasoning. Participants' behavior was tracked through example viewing patterns, edit sequences, and multi-attempt submissions, showing that harder problems elicited longer deliberation times and greater divergence in solution strategies. Interestingly, while participants initiated responses more quickly over time, accuracy showed a slight decline, suggesting increased task familiarity rather than improved rule-learning. Even incorrect solutions often converged on similar answers, and problem-solving trajectories varied from direct efficient paths to extended explorations with partial restarts. This rich behavioral dataset, now publicly available via arXiv, offers AI researchers a new roadmap for building systems that can mimic human-like generalization, error patterns, and adaptive reasoning under uncertainty.
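The kind of event-level logging described above can be pictured as a per-trial stream of timestamped actions from which latency measures are derived. The sketch below is a guess at that shape; the field names, event kinds, and `deliberation_time` helper are assumptions for illustration, not the dataset's actual schema.

```python
# Hypothetical sketch of a per-participant behavioral event log
# (viewing examples, editing grid cells, submitting attempts).
# Field names and event kinds are assumptions, not the paper's schema.
from dataclasses import dataclass, field

@dataclass
class TrialEvent:
    participant_id: str
    task_id: str
    timestamp_ms: int   # milliseconds since trial start
    kind: str           # "view_example" | "edit_cell" | "submit"
    detail: dict = field(default_factory=dict)

def deliberation_time(events):
    """Time from trial start to the first edit: a simple latency measure
    of the kind used to compare easy and hard problems."""
    edits = [e.timestamp_ms for e in events if e.kind == "edit_cell"]
    return min(edits) if edits else None

log = [
    TrialEvent("p01", "t42", 500, "view_example", {"example": 0}),
    TrialEvent("p01", "t42", 2100, "edit_cell", {"cell": (0, 1), "color": 3}),
    TrialEvent("p01", "t42", 9800, "submit", {"attempt": 1}),
]
print(deliberation_time(log))  # -> 2100
```

From a stream like this, the richer measures the study reports (edit sequences, restarts, multi-attempt submissions, strategy divergence across participants) are all aggregations over the same timestamped events.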
- Humans achieved 80-90% accuracy on 75 abstract visual reasoning problems from the ARC benchmark, far exceeding current AI capabilities.
- The study tracked 260 participants' problem-solving at high resolution, revealing how strategy diverges on harder problems and how even incorrect answers converge.
- CogARC creates a rich behavioral dataset to train AI systems that better mimic human generalization and adaptive reasoning.
Why It Matters
This research provides a blueprint for building AI that can reason and adapt like humans, potentially closing a major gap in artificial general intelligence.