What if LLMs are mostly crystallized intelligence?
A new analysis argues that scaling LLMs won't lead to AGI without breakthroughs in fluid intelligence.
A recent LessWrong article by deep5th (a linkpost from Expected Surprise) argues that large language models predominantly exhibit crystallized intelligence (excellence at pattern-matching learned from training data) rather than true fluid intelligence (novel reasoning and adaptation to unfamiliar problems). The author notes that LLMs ace benchmarks such as the SAT and ARC-AGI, yet remain mediocre at many tasks humans handle with ease. This jagged performance profile suggests their success comes from compressing behavioral patterns into neural weights rather than from building a deep world model.
If this analysis holds, scaling LLMs alone may not lead to omni-competent AGI. Training data is expected to run dry sometime between 2026 and 2032, forcing labs to prioritize specialized data generation. The author warns that benchmarks like METR's AI R&D evaluations may overestimate real-world progress because they measure closed-form tasks; open-ended, data-poor problems depend more heavily on fluid intelligence, where LLMs are weak. Still, the author assigns only a 20% probability to this significantly slowing progress, acknowledging that new paradigms may emerge. The implications for AI safety are nuanced: a slower takeoff could be positive, but labs may still achieve 10x speedups in R&D over the coming years.
- LLMs show strong crystallized intelligence (pattern matching) but weak fluid intelligence (novel reasoning), as evidenced by high SAT scores alongside poor performance on many other tasks.
- Training data may be exhausted between 2026 and 2032, leading to specialized data collection and jagged capability growth.
- Benchmarks like METR's AI R&D evaluations may overestimate real-world progress because they measure closed-form tasks rather than open-ended, data-poor problems that require fluid intelligence.
Why It Matters
If LLMs plateau on fluid intelligence, AI progress may slow, giving more time for safety research and policy.