Social agency theory challenges AI safety assumptions on planning origins
Human planning may be a set of socially learned behaviors, not a general algorithm.
Deep Dive
An essay argues that human agency does not emerge from low-level reflexes generalizing into a general planning algorithm; instead, it consists of distinct, socially learned behaviors. This challenges common AI safety models of agency (MIRI, shard theory) and suggests that planning is not a simple core of agency. The author claims that introspective evidence about cognition is neglected, and that concerns about inner misalignment may be reduced because sophisticated reasoning is learned explicitly rather than acquired in ways that are inaccessible.
Key Points
- Rejects the common model of agency as low-level to high-level generalization
- Claims human planning is socially learned, not a general algorithm
- Suggests inner misalignment in AI systems may be less of a concern
Why It Matters
If the essay is correct, AI alignment strategies that assume bootstrapped general planning may be fundamentally misguided.