Sparse Efficiency vs. Superposition: The AI Architecture Tradeoff
Human brains use roughly 10,000x less energy than frontier training runs, but dense AI models win via superposition.
Hillz's post on LessWrong highlights a fundamental tradeoff in AI architecture: the brain-like sparse efficiency of mixture-of-experts versus the dense superposition that makes modern models so powerful. The human brain consumes only about 5 MWh over 28 years, roughly 10,000x less energy than today's frontier training runs. Normal cognition relies on sparse, localized circuits rather than firing everything at once. Techniques like mixture-of-experts, which activate only a few expert subnetworks per input, are a step in that direction, but they risk weakening superposition: the ability of dense models to compress many rare features that seldom activate together into shared neurons (a key insight from Anthropic's interpretability research).
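As a rough check on the 5 MWh figure, the arithmetic below assumes a brain power draw of about 20 W, a commonly cited estimate that is not stated in the text above. The 10,000x ratio is taken from the post; the training-run energy it implies is derived from that ratio, not measured.

```python
HOURS_PER_YEAR = 365.25 * 24   # 8766 hours, averaging in leap years
BRAIN_POWER_WATTS = 20.0       # ASSUMPTION: commonly cited estimate, not from the post
YEARS = 28                     # time span quoted in the post
CLAIMED_RATIO = 10_000         # energy ratio quoted in the post


def brain_energy_mwh(power_watts=BRAIN_POWER_WATTS, years=YEARS):
    """Total energy in megawatt-hours for a constant power draw."""
    watt_hours = power_watts * years * HOURS_PER_YEAR
    return watt_hours / 1e6    # 1 MWh = 1,000,000 Wh


brain_mwh = brain_energy_mwh()
# Training-run energy implied by the quoted ratio (derived, not measured).
implied_training_mwh = brain_mwh * CLAIMED_RATIO

print(f"Brain over {YEARS} years: {brain_mwh:.2f} MWh")
print(f"Implied training run: {implied_training_mwh / 1000:.1f} GWh")
```

Under these assumptions the brain comes to roughly 4.9 MWh, and the quoted ratio implies a training run on the order of 50 GWh.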
From a safety perspective, superposition makes interpretability research difficult, because a single neuron can respond to several unrelated features, though Anthropic is making progress. Hillz suggests that more segmented architectures might naturally reduce superposition, making models easier to inspect, audit, and constrain. The central question: can we design a middle path that achieves dramatic efficiency gains while retaining enough shared representation to preserve model power, and potentially improves governability along the way?
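To make superposition concrete, here is a toy sketch. It is not Anthropic's code and is far simpler than their experiments: three hypothetical features are stored in a two-dimensional hidden state by giving each feature a direction 120 degrees apart, then read back out with a ReLU. All names (`DIRECTIONS`, `encode`, `decode`) are illustrative assumptions.

```python
import math


def unit(angle_deg):
    """Unit vector in the plane at the given angle."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))


# Three feature directions packed into two dimensions (hypothetical layout).
# Any two distinct directions have dot product cos(120 deg) = -0.5.
DIRECTIONS = [unit(90), unit(210), unit(330)]


def encode(features):
    """Compress 3 feature values into a 2-d hidden state (weighted sum)."""
    hx = sum(f * d[0] for f, d in zip(features, DIRECTIONS))
    hy = sum(f * d[1] for f, d in zip(features, DIRECTIONS))
    return (hx, hy)


def decode(hidden):
    """Read each feature back: project onto its direction, then ReLU."""
    return [max(0.0, d[0] * hidden[0] + d[1] * hidden[1]) for d in DIRECTIONS]


# One active feature: the ReLU clips the negative interference to zero.
sparse_out = decode(encode([1.0, 0.0, 0.0]))
# Two active features: each readout is pulled down by the other feature.
dense_out = decode(encode([1.0, 1.0, 0.0]))

print([round(v, 3) for v in sparse_out])
print([round(v, 3) for v in dense_out])
```

With a single active feature the readout is approximately [1, 0, 0]; with two active features it falls to approximately [0.5, 0.5, 0]. That interference is why the trick only pays off for features that rarely fire together, and why neither hidden dimension maps cleanly onto one feature.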
- The human brain uses ~5 MWh over 28 years, roughly 10,000x less energy than a modern frontier training run.
- Mixture-of-experts increases sparsity but may lose the compression benefits of superposition.
- Anthropic's work shows that superposition lets dense models pack many rare features into shared neurons, but it also complicates interpretability.
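The sparsity that mixture-of-experts provides can be sketched as top-k routing: a router scores every expert, but only the k best-scoring experts actually run. The snippet below is a minimal illustration under stated assumptions, not any particular model's implementation. The experts are hypothetical scalar functions, and renormalizing with a softmax over only the selected scores is one common choice among several.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_scores, k=2):
    """Map the k highest-scoring expert indices to mixing weights."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    top = ranked[:k]
    weights = softmax([router_scores[i] for i in top])
    return dict(zip(top, weights))


# Hypothetical experts: each just scales its input by a fixed factor.
EXPERTS = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]


def moe_forward(x, router_scores, k=2):
    """Run only the selected experts and mix their outputs."""
    routing = top_k_route(router_scores, k)
    output = sum(w * EXPERTS[i](x) for i, w in routing.items())
    return output, sorted(routing)


# Experts 1 and 3 tie for the highest score; experts 0 and 2 never run.
output, active = moe_forward(1.0, [0.1, 2.0, -1.0, 2.0], k=2)
print(active, output)
```

Here only two of the four experts do any work for this input, which is where the compute savings come from. It is also the kind of segmentation that, per the post, may leave less room for features to share neurons.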
Why It Matters
How this architecture tradeoff is resolved will shape whether future AI can be both energy-efficient and safely interpretable.