LessWrong's Raymond Douglas: Don't Confuse Selective with Predictive Optimization in AI
Mistaking selective optimization for predictive optimization can mislead AI alignment work. Here is why predictive optimization generalizes better.
Raymond Douglas's essay on LessWrong draws a critical distinction between selective and predictive optimization processes, warning that conflating the two can lead to dangerous misunderstandings, especially in AI alignment. Selective optimization, like evolution or simple gradient descent on Atari games, produces entities that achieve an outcome without intending it and that fail to generalize beyond the training distribution. Predictive optimization, such as AlphaZero training a policy on its own rollouts, explicitly models outcomes and generalizes more robustly, though it often emerges from earlier selective processes.
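The contrast can be made concrete with a toy sketch. It is an illustration written for this summary, not code from Douglas's essay, and every name in it (`outcome`, `selective_optimize`, `predictive_optimize`, the targets, the candidate grid) is hypothetical. The selective optimizer keeps whichever random mutations happen to score well and ends up with a fixed action; the predictive optimizer consults a model of outcomes before acting, here assumed to be perfectly accurate, so it can re-plan when the goal moves.

```python
import random


def outcome(action: float, target: float) -> float:
    """Environment score: higher is better, with the peak at action == target."""
    return -(action - target) ** 2


def selective_optimize(target: float, generations: int = 200,
                       pop_size: int = 20, seed: int = 0) -> float:
    """Mutate and keep the best. The result is a fixed action with no model of why it works."""
    rng = random.Random(seed)
    population = [rng.uniform(-10, 10) for _ in range(pop_size)]
    for _ in range(generations):
        # Candidates are scored by actually trying them, not by prediction.
        population.sort(key=lambda a: outcome(a, target), reverse=True)
        survivors = population[: pop_size // 2]
        children = [a + rng.gauss(0, 0.5) for a in survivors]
        population = survivors + children
    return max(population, key=lambda a: outcome(a, target))


def predictive_optimize(target: float, candidates: list[float]) -> float:
    """Predict each candidate's outcome with an internal model, then pick the best."""
    model = outcome  # assumption: the agent's internal model is perfectly accurate
    return max(candidates, key=lambda a: model(a, target))


train_target, new_target = 3.0, -4.0
grid = [i / 10 for i in range(-100, 101)]

selected_action = selective_optimize(train_target)     # tuned to the training target only
planned_train = predictive_optimize(train_target, grid)
planned_new = predictive_optimize(new_target, grid)    # re-plans for the new target

print("selected action:", selected_action)
print("planned actions:", planned_train, planned_new)
print("selected action's score on the new target:", outcome(selected_action, new_target))
print("planned action's score on the new target:", outcome(planned_new, new_target))
```

Both optimizers do well on the training target, but only the predictive one does well once the target shifts, because the selected action carries no representation of the goal it was selected for. Real systems are messier: a learned model can itself be wrong outside its training distribution, so this sketch only illustrates the direction of the claim.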
Douglas provides numerous examples: bacteria developing antibiotic resistance (selective) vs. a hacker finding a vulnerability (predictive); humans evolving to lie (selective) vs. humans genetically modifying crops (predictive). He warns that overlooking selective optimization means underrating the computational search involved and misunderstanding an entity's true capabilities. The piece offers AI safety researchers a framework for evaluating how systems achieve their goals, and it invokes "Chesterton's Fence": don't scrap traditions without first estimating the selective optimization that produced them. He also touches on alignment failure modes: reward is not the optimization target, and humans are "adaptation executors," not fitness maximizers, which parallels challenges in AI alignment.
- Selective optimization (evolution, gradient descent) generalizes poorly; predictive optimization (AlphaZero) generalizes better.
- Misinterpreting selective as predictive can lead to dangerous assumptions about intent and capability.
- Douglas invokes "Chesterton's Fence": don't discard traditions without estimating the selective optimization behind them.
Why It Matters
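Knowing whether a system's behavior was selected for or predicted by the system itself is essential for AI alignment, and it guards against overconfident predictions about how that system will behave outside its training conditions.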