Viral Wire

DeepMind CEO Demis Hassabis says solving Erdős problems isn't AGI

DeepMind's AI just solved 8 of the hardest combinatorial problems in mathematics, yet the company's CEO insists this isn't a step toward AGI — a position that reveals more about the limits of benchmarks than the capabilities of machines.

Deep Dive

DeepMind's latest achievement — solving 8 Erdős problems using a purpose-built AI system that searches through well-defined combinatorial spaces — should be a triumph. Yet CEO Demis Hassabis immediately set the record straight: this is not artificial general intelligence, nor is it a bridge to it. The distinction is critical. The problems tackled are finite, with clear solution spaces, and the AI's success stems from brute-force search or heuristics within constrained domains, not from inventing new mathematical concepts or generalizing across fields. Hassabis's framing is a deliberate effort to prevent hype, echoing his previous statements when DeepMind's AI mastered Go and StarCraft — narrow tasks that once seemed like milestones but now are recognized as isolated feats.

The AI landscape is increasingly shaped by how companies define progress. OpenAI's o1 reasoning model, for instance, excels at coding tasks and at competition math such as AIME and Putnam problems by applying chain-of-thought reasoning across a broad range of tasks, which positions it as a general-purpose reasoning engine. Anthropic's Claude focuses on safety and analytical rigor. DeepMind, by contrast, continues to target narrow but deep mathematical domains; AlphaGeometry and its silver-medal-level result on International Mathematical Olympiad problems are earlier examples. This specialization allows DeepMind to claim breakthroughs while insulating itself from accusations of overpromising AGI. Alphabet, the company's parent, allocated over $12 billion to AI research in 2024, so narrative discipline matters: investors and regulators alike watch for signs that AGI is near.

The reaction to Hassabis's caution was swift and divided. NYU's Gary Marcus, a long-time AI skeptic, endorsed the view that solving Erdős problems is a far cry from open-ended innovation. Economist Noah Smith countered that mathematical reasoning is a core AGI capability and that the achievement merits celebration. This disagreement exposes a fundamental rift in how progress is evaluated. On one side, narrow benchmarks are seen as necessary but insufficient; on the other, they are evidence of a foundation being laid. The hidden risk is overinterpretation: a system that solved 8 of the many Erdős problems through combinatorial search may not generalize to even slightly different problem types, and the compute cost and dataset design behind it are opaque. DeepMind's strategic humility protects its reputation but also raises the bar for what counts as real progress, a move that pressures competitors to either match the honesty or double down on hype.

The bottom line is that AGI is not a single capability but a bundle of generalization, creativity, and cross-domain transfer. Benchmarks like Erdős problems reveal narrow competence, not generality. As investors and researchers look for signals, the smart bet is not on solving more puzzles but on systems that can invent their own puzzles — and then invent the tools to solve them.

Key Points
  • DeepMind's solution of 8 Erdős problems relies on combinatorial search, not open-ended invention, making it a narrow achievement despite its mathematical difficulty.
  • The debate between Gary Marcus and Noah Smith highlights a growing philosophical split: benchmarks vs. generalizability as measures of AGI progress.
  • Alphabet's 2024 AI research budget of over $12B and OpenAI's $150B valuation depend on narrative management; companies that decline to oversell narrow successes may gain credibility by appearing cautious about AGI.

Why It Matters

The Erdős problem debate sharpens the definition of AGI, influencing investment, regulation, and research priorities across the industry.