Why AI Evaluation Regimes Are Bad
A viral critique argues that flagship AI safety evaluations help corporations and hinder real regulation.
A provocative article by AI safety community insiders PranavG and Gabriel Alfour has gone viral, delivering a scathing critique of the field's focus on external AI evaluations. The authors, who share the goal of preventing extinction risks from superintelligence, argue that the prominent "Evals" project, spearheaded by organizations such as Apollo Research, METR, and the UK AISI, is not merely ineffective but actively harmful. They contend that these evaluations have become a dominant, resource-intensive focus that crowds out more crucial work: passing binding safety regulations, which do not currently exist.
The critique rests on three core claims. First, the authors assert that the "Theory of Change" behind Evals is fundamentally broken: it presupposes regulations that mandate independent audits and compel action on their findings, and no such regulations exist. Second, they argue that Evals inadvertently shift the burden of proof away from the AI corporations building these systems (such as Anthropic or OpenAI) by framing safety as a technical audit problem rather than a regulatory one. Finally, they question the independence of Evals organizations, suggesting that their operational reliance on corporate cooperation creates perverse incentives to side, at times, with AI companies against stringent regulation. The article concludes that the project unjustifiably commands prominence and resources that would be better directed toward advocacy for concrete, enforceable policy.
- The authors claim the core theory behind AI evaluations is broken, as it assumes non-existent regulations that mandate audits and action.
- They argue evaluations shift the burden of proof onto auditors and away from AI corporations developing the systems.
- The article questions the independence of Evals organizations, suggesting their operational model can align them with corporate interests over safety advocacy.
Why It Matters
This internal critique challenges a cornerstone of current AI safety strategy, urging a pivot from technical auditing to political advocacy for binding rules.