Chollet argues real AGI shouldn’t need human handholding on new tasks
Google AI researcher argues true AGI must master novel tasks without any human examples or instructions.
In a widely discussed position, Google AI researcher François Chollet has issued a fundamental challenge to how the tech industry defines and measures Artificial General Intelligence (AGI). Chollet argues that the current generation of large language models (LLMs) like OpenAI's GPT-4, Anthropic's Claude 3, and Google's own Gemini, despite their impressive capabilities, fail a core test of generality. Their performance is critically dependent on "human-in-the-loop" scaffolding, including detailed prompting, few-shot examples, and task-specific fine-tuning. For Chollet, a system that cannot autonomously acquire skills for a novel task—defined as one outside its training distribution—without this handholding does not qualify as AGI.
Chollet's critique centers on "skill-acquisition efficiency," a concept he formalized in his 2019 paper "On the Measure of Intelligence" and operationalized in the ARC (Abstraction and Reasoning Corpus) benchmark. He posits that real intelligence is the efficiency with which a system turns a small amount of experience or information into a robust new skill. Current LLMs, he suggests, are less independent problem-solvers than vast libraries of compressed human knowledge that need a librarian (the human prompter) to retrieve the right entry. This perspective moves the AGI goalpost from high scores on existing benchmarks to systems with innate meta-learning abilities that can generalize from first principles.
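ARC tasks make this concrete: a solver sees a handful of input-to-output grid pairs and must infer the underlying transformation well enough to apply it to an unseen input. The sketch below illustrates that evaluation loop with an invented toy task (horizontal mirroring); real ARC tasks use larger colored grids and far more varied rules, and the "solver" here is just a single hard-coded hypothesis, not an actual learning system.

```python
# Toy illustration of ARC-style evaluation: infer a rule from a few
# demonstration pairs, then apply it to a novel test input.
# The task (mirror each grid left-to-right) is invented for this sketch.

def mirror(grid):
    """Candidate rule a solver might hypothesize: horizontal reflection."""
    return [row[::-1] for row in grid]

# Few demonstration pairs -- all the "experience" the solver receives.
demos = [
    ([[1, 0], [2, 3]], [[0, 1], [3, 2]]),
    ([[5, 5, 0]],      [[0, 5, 5]]),
]

# Skill acquisition: does the hypothesized rule explain every demonstration?
if all(mirror(inp) == out for inp, out in demos):
    # Generalization: apply the inferred rule to an unseen test input.
    test_input = [[7, 0, 4], [0, 0, 9]]
    print(mirror(test_input))  # -> [[4, 0, 7], [9, 0, 0]]
```

Skill-acquisition efficiency, in Chollet's framing, is how little such demonstration data a system needs before it can produce the correct output on the held-out test input.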
The implications are significant for AI development roadmaps and benchmarking. If Chollet's definition is accepted, simply scaling up model size and training data may never yield AGI; architectural breakthroughs in reasoning and learning algorithms would be required instead. His definition also provides a more rigorous, testable standard for AGI claims, moving beyond vague marketing toward a measurable capability: autonomous task mastery without human guidance.
- Chollet defines true AGI by "skill-acquisition efficiency," the ability to master novel tasks without human examples or instructions.
- He argues current LLMs (GPT-4, Claude, Gemini) fail this test, as they rely on prompting and fine-tuning for new tasks.
- This redefinition challenges scaling-as-path-to-AGI narratives and emphasizes the need for new meta-learning architectures.
Why It Matters
Chollet's definition sets a rigorous, measurable standard for AGI, shifting the focus from benchmark scores to autonomous problem-solving and giving future R&D a concrete target.