AI Safety

Claude Mythos #3: Capabilities and Additions

New analysis reveals Mythos crosses key thresholds, prompting White House action and Project Glasswing.

Deep Dive

New analysis of Anthropic's Claude Mythos reveals a significant threshold crossing in AI capabilities, particularly in cybersecurity. According to Zvi's detailed examination on LessWrong, Mythos breaks Anthropic's previous trendline on the Epoch Capabilities Index (ECI) - an amalgamation of benchmarks using item response theory. The model's cybersecurity capabilities are described as 'quite scary,' prompting coordinated safety responses including Project Glasswing and White House involvement. While other capabilities don't appear equally alarming, the analysis suggests we've crossed a key threshold where AI-assisted cyber capabilities demand serious defensive measures.

Anthropic attributes Mythos's gains to human research breakthroughs rather than AI assistance, having interviewed researchers to confirm advances were made without significant help from earlier-generation models. The model represents both gains over time and benefits from increased size, with external analysis showing Claude moving from substantially below OpenAI models to being narrowly ahead. The White House is reportedly racing to address potential threats, with key players now cooperating on safety measures. The analysis concludes that while Mythos isn't a complete trend break when accounting for renewed ability to increase model size, that size increase capability itself represents a significant shift in what's possible.

Key Points
  • Mythos breaks Anthropic's trendline on Epoch Capabilities Index (ECI), moving Claude from below to narrowly ahead of OpenAI models
  • Cybersecurity capabilities described as 'quite scary,' triggering Project Glasswing safety response and White House coordination
  • Anthropic attributes gains to human research breakthroughs, not AI assistance, with researchers confirming advances made without significant model help

Why It Matters

Marks a threshold where AI cybersecurity capabilities demand coordinated safety responses and could accelerate regulatory action.