A Guide to the Theory of Appropriateness Papers
A 100+ page paper models human cognition as pattern completion over cultural norms.
A team led by DeepMind's Joel Z. Leibo has published a comprehensive guide to their 'Theory of Appropriateness' research sequence, which offers a novel computational framework for understanding human cognition and social dynamics. The core of the work is a substantial 100+ page paper from 2024 that posits that human behavior is largely a process of 'pattern completion' over culturally learned expectations of what is 'appropriate' in a given context. The theory bridges cognitive science, economics, and social dynamics to model how norms guide action.
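To make the 'pattern completion' idea concrete, here is a minimal toy sketch (not the paper's actual model; all names and the frequency-based completion rule are illustrative assumptions): an agent tallies observed (context, action) pairs and later 'completes' a familiar context with the action most often seen there.

```python
from collections import Counter, defaultdict

# Toy illustration of pattern completion over learned norms.
# This is a hypothetical sketch, not the model from arXiv:2412.19010.

class NormLearner:
    def __init__(self):
        # context -> counts of actions observed in that context
        self.observations = defaultdict(Counter)

    def observe(self, context, action):
        """Record one observed (context, action) pair."""
        self.observations[context][action] += 1

    def complete(self, context):
        """Complete a context with its most frequently observed action."""
        counts = self.observations.get(context)
        if not counts:
            return None  # no learned norm for an unfamiliar context
        return counts.most_common(1)[0][0]

agent = NormLearner()
for _ in range(3):
    agent.observe("library", "whisper")
agent.observe("library", "shout")
agent.observe("stadium", "cheer")

print(agent.complete("library"))  # -> whisper
print(agent.complete("stadium"))  # -> cheer
```

The point of the sketch is that behavior falls out of learned contextual regularities rather than an explicit goal: the agent never optimizes anything, it simply fills in what is 'appropriate' for the context it finds itself in.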
The research sequence has two main branches: one focused on human behavior, including papers on norms of rationality and models of status signaling, and another applying the framework to AI technology governance. Key applied papers include 'A Pragmatic View of AI Personhood' (2025) and a paper conceptualizing societal progress as sewing a 'patchy, polychrome quilt.' The work argues that for AI to be truly aligned and safe, its reasoning must share the 'type signature' of human practice-based logic, rather than being driven by brittle, goal-oriented programming.
The theory challenges conventional AI alignment approaches that focus on instilling fixed goals or rules. Instead, it suggests that properties like harmlessness and corrigibility are more naturally achieved by AI systems whose internal reasoning mirrors the fluid, context-sensitive, and norm-driven nature of human judgment. This represents a significant shift in thinking about how to build AI that can genuinely collaborate within human social structures.
- Core 100+ page paper (arXiv:2412.19010) models human behavior as 'pattern completion' over learned cultural norms.
- Applies framework to AI governance, including a 2025 paper proposing a pragmatic view of AI personhood.
- Argues AI safety requires reasoning based on human 'practices,' not rigid goals, for natural alignment.
Why It Matters
Provides a new framework for building AI that aligns with human social reasoning, crucial for safe collaboration.