AI Safety

Richard Ngo's 'belief webs' model unifies agents' beliefs, goals, actions

Introduces a framework where actions are beliefs and global inconsistency is allowed.

Deep Dive

Richard Ngo's new post on LessWrong, 'Agents as Webs of Beliefs,' sketches an informal model that rethinks how intelligent agents represent and process information. The core premise is that an agent's beliefs are typically locally consistent with nearby beliefs but not globally consistent with all of its beliefs. This departs from frameworks like causal graphs or active inference, which assume a single probability distribution. To handle global inconsistency, Ngo draws on probabilistic dependency graphs (PDGs) and Garrabrant induction, which impose local constraints through hyperedges and traders respectively. He then argues that having exactly two layers (base beliefs and constraints) is artificial, and points to hierarchical concept formation in predictive processing and deep learning as a more natural structure. One open challenge is that high-level concepts do not have binary truth values; even so, he sees promise in working out the properties of belief webs.
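The gap between local and global consistency can be made concrete with a toy example. The sketch below is not taken from Ngo's post, and it is not an implementation of PDGs or Garrabrant induction. It assumes three hypothetical binary beliefs A, B, and C, with pairwise constraints the agent holds with certainty. Each constraint on its own is a valid distribution, and all of them agree on every individual belief's marginal, yet no single assignment (and so no joint distribution) satisfies all three.

```python
from itertools import product

# Hypothetical local constraints: each names a pair of binary beliefs and
# a relation the agent holds with certainty between them.
constraints = {
    ("A", "B"): lambda a, b: a == b,
    ("B", "C"): lambda b, c: b == c,
    ("A", "C"): lambda a, c: a != c,
}
variables = ["A", "B", "C"]


def pair_marginals(relation):
    """Spread probability uniformly over the value pairs the relation
    allows, then return the marginal of each of the two variables."""
    allowed = [(x, y) for x, y in product([0, 1], repeat=2) if relation(x, y)]
    weight = 1 / len(allowed)
    first = {v: sum(weight for x, _ in allowed if x == v) for v in (0, 1)}
    second = {v: sum(weight for _, y in allowed if y == v) for v in (0, 1)}
    return first, second


def locally_consistent():
    """True if every constraint implies the same marginal for each
    variable it touches (neighbouring constraints do not disagree)."""
    seen = {}
    for (u, v), relation in constraints.items():
        for name, marginal in zip((u, v), pair_marginals(relation)):
            if seen.setdefault(name, marginal) != marginal:
                return False
    return True


def globally_consistent_worlds():
    """Return every full assignment that satisfies all constraints at once."""
    worlds = []
    for values in product([0, 1], repeat=len(variables)):
        world = dict(zip(variables, values))
        if all(rel(world[u], world[v]) for (u, v), rel in constraints.items()):
            worlds.append(world)
    return worlds


if __name__ == "__main__":
    print("locally consistent:", locally_consistent())
    print("worlds satisfying everything:", globally_consistent_worlds())
```

Because the constraints are held with certainty, any joint distribution would have to place all of its mass on assignments that satisfy every constraint. The enumeration finds none, which is the sense in which a single-distribution framework cannot represent this set of beliefs.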

Ngo also explores the idea that actions are beliefs, building on Abram Demski's FixDT. In many real-world scenarios, especially social interactions, an agent's thoughts (beliefs) can affect outcomes directly rather than only through external actions. This blurs the line between epistemic processes and decision-making, suggesting that decision theory must account for how beliefs themselves shape the world. By treating beliefs, goals, and actions as three facets of a single phenomenon, the belief webs framework offers a unified lens for AI alignment, agent foundations, and cognitive science, though it remains a rough sketch awaiting further formalization.
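A toy illustration of beliefs that shape outcomes is sketched below. It is only loosely inspired by the fixed-point idea and is not Demski's FixDT formalism or anything specified in Ngo's post. The response function `outcome_probability`, the grid of candidate beliefs, and the default utility are all assumptions made for the example. A belief counts as self-fulfilling when holding it makes it accurate, and the agent picks the self-fulfilling belief it prefers.

```python
def outcome_probability(belief):
    """Hypothetical world: the chance of success depends on how confident
    the agent is (for instance, confidence that others can perceive)."""
    return 0.9 if belief >= 0.5 else 0.1


def self_fulfilling_beliefs(steps=10, tol=1e-9):
    """Scan a grid of candidate beliefs and keep those that are accurate
    once the world has responded to them, i.e. where f(p) == p."""
    candidates = [i / steps for i in range(steps + 1)]
    return [p for p in candidates if abs(outcome_probability(p) - p) < tol]


def choose_belief(utility=lambda p: p):
    """Among the self-fulfilling beliefs, pick the one with highest utility.
    The default utility simply prefers a higher chance of success."""
    return max(self_fulfilling_beliefs(), key=utility)


if __name__ == "__main__":
    print("self-fulfilling beliefs:", self_fulfilling_beliefs())
    print("chosen belief:", choose_belief())
```

In this sketch both a pessimistic and an optimistic belief are accurate once held, so accuracy alone does not settle which one to adopt. Choosing between them behaves like a decision, which is one way to read the claim that beliefs and actions are hard to separate.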

Key Points
  • Belief webs allow local consistency with global inconsistency, using PDGs and Garrabrant induction as formal tools.
  • Hierarchical concept formation, as seen in predictive processing and deep learning, is preferred over a two-layer structure of base beliefs plus constraints (hyperedges or traders).
  • Actions are treated as beliefs that can directly influence the world, referencing Abram Demski's FixDT for decision theory implications.

Why It Matters

Challenges traditional agent models, offering a unified view for AI alignment and reasoning under real-world inconsistency.
