
Directional Confusions Reveal Divergent Inductive Biases Through Rate-Distortion Geometry in Human and Machine Vision

arXiv q-bio.NC April 24, 2026

AI and humans make different directional mistakes, revealing hidden biases in vision models

Deep Dive

In a paper posted to arXiv (2604.21909), Leyla Roksan Caglar, Pedro A.M. Mediano, and Baihan Lin examine how the inductive biases of humans and deep vision models diverge, using directional confusions as the probe: cases where one category is mistaken for another more often than the reverse. They collected matched human and model responses on natural-image categorization under 12 perturbation types and quantified the asymmetry of the resulting confusion matrices. Using a Rate-Distortion (RD) framework, they derived three geometric signatures, slope (beta), curvature (kappa), and efficiency (AUC), to characterize the geometry of generalization. Humans show broad but weak asymmetries, while deep models exhibit sparse, strong directional collapses. Robustness training reduces global asymmetry but does not produce human-like breadth-strength profiles. Mechanistic simulations confirm that different ways of organizing asymmetry shift the RD frontier in opposite directions, even when performance is matched. The authors therefore position directional confusions and RD geometry as compact, interpretable signatures of inductive bias under distribution shift, and as a new way to evaluate and improve AI vision systems.
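To make the breadth-versus-strength distinction concrete, here is a minimal Python sketch of one way to measure asymmetry in a confusion matrix. The function name, the 0.05 threshold, and the definitions of "global", "breadth", and "strength" are illustrative assumptions, not the paper's actual metrics, and the two toy matrices are invented for the example.

```python
import numpy as np

def directional_asymmetry(counts, threshold=0.05):
    """Summarize asymmetry in a confusion matrix (rows = true class).

    All three summaries are illustrative assumptions, not the paper's metrics.
    """
    counts = np.asarray(counts, dtype=float)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Row-normalize to P(response j | true class i); empty rows stay zero.
    probs = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

    upper = np.triu_indices_from(probs, k=1)          # each unordered pair once
    pair_asym = np.abs(probs[upper] - probs.T[upper])  # |P(i->j) - P(j->i)|
    pair_mass = probs[upper] + probs.T[upper]          # total confusion per pair

    total_mass = pair_mass.sum()
    active = pair_asym > threshold
    return {
        # Share of all confusion mass that is one-directional.
        "global": float(pair_asym.sum() / total_mass) if total_mass > 0 else 0.0,
        # Fraction of class pairs with a noticeable asymmetry.
        "breadth": float(active.mean()) if active.size else 0.0,
        # Mean asymmetry among those noticeable pairs.
        "strength": float(pair_asym[active].mean()) if active.any() else 0.0,
    }

# Toy matrices (invented): many small asymmetries vs. one large one.
broad_weak = directional_asymmetry([[80, 12, 8], [6, 84, 10], [14, 4, 82]])
sparse_strong = directional_asymmetry([[60, 40, 0], [0, 100, 0], [0, 0, 100]])
print(broad_weak)
print(sparse_strong)
```

Under these assumed definitions, the first toy matrix scores high on breadth and low on strength, and the second does the opposite, which mirrors the human-versus-model contrast the summary describes.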

The study bears on AI safety and interpretability because it offers a systematic way to surface biases in vision models that accuracy alone misses. Knowing how a model's category confusions differ from human ones can guide the design of more robust, human-aligned systems, and the RD framework gives a principled way to measure those differences during training and evaluation. The work underlines the value of looking beyond top-1 accuracy, especially in applications such as autonomous driving or medical imaging, where the pattern of misclassification matters.
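As a rough illustration of how slope, curvature, and efficiency could be read off a sampled RD curve, the sketch below uses finite differences and a trapezoid rule. It assumes that beta is the negative mean slope (the usual Lagrangian convention), that kappa is the mean second derivative, and that AUC is the area under the curve; the paper's own definitions may differ. The function name is hypothetical, and the toy curve is the textbook rate-distortion function of a unit-variance Gaussian source, not data from the study.

```python
import numpy as np

def rd_signatures(distortion, rate):
    """Estimate slope, curvature, and area from sampled (D, R) points.

    The mapping of these quantities to beta, kappa, and AUC is an assumption.
    """
    d = np.asarray(distortion, dtype=float)
    r = np.asarray(rate, dtype=float)
    order = np.argsort(d)                 # make sure distortion is increasing
    d, r = d[order], r[order]

    slope = np.gradient(r, d)             # finite-difference dR/dD
    second = np.gradient(slope, d)        # finite-difference d^2R/dD^2

    return {
        "beta": float(-slope.mean()),     # positive when rate falls with distortion
        "kappa": float(second.mean()),    # positive for a convex curve
        # Trapezoid rule for the area under R(D).
        "auc": float(np.sum(0.5 * (r[1:] + r[:-1]) * np.diff(d))),
    }

# Toy curve: unit-variance Gaussian source, R(D) = 0.5 * log2(1 / D) for D <= 1.
toy_d = np.linspace(0.05, 1.0, 50)
toy_r = 0.5 * np.log2(1.0 / toy_d)
toy = rd_signatures(toy_d, toy_r)
print(toy)
```

Comparing these three numbers between two systems with matched accuracy is one simple way to see whether their RD frontiers differ in shape, which is the kind of comparison the summary attributes to the paper.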

Key Points

Humans show broad but weak directional confusions; deep vision models have sparse, strong directional collapses
Rate-Distortion framework yields three geometric signatures: slope, curvature, and efficiency
Robustness training reduces global asymmetry but fails to replicate human-like graded similarity profiles

Why It Matters

Reveals hidden biases in AI vision systems that accuracy alone misses, improving model evaluation and alignment with human perception.
