Media & Culture

Anthropic’s Mythos breach was humiliating

Hackers accessed a 'too dangerous' AI model via an educated guess

Deep Dive

Anthropic's tightly controlled rollout of Claude Mythos has taken an embarrassing turn. After weeks of insisting the AI model is so capable at cybersecurity that it's too dangerous for public release, a small group of unauthorized users gained access on the day Anthropic announced plans to offer it for testing. According to Bloomberg, the breach relied on an educated guess about the model's online location, combining information exposed in a prior breach of Mercor, an AI training data company, with insider knowledge gained through contract work. Security researcher Lukasz Olejnik called it an 'entirely imaginable' failure of a kind the cybersecurity industry has dealt with for 20 years, suggesting Anthropic should have anticipated it given the known Mercor breach.

The incident is particularly awkward for Anthropic, which built its brand on AI safety while touting Mythos's cybersecurity prowess. The company claims Mythos found vulnerabilities in every major operating system and web browser, and its release was meant to be coordinated so as to reinforce global cyber defenses. Early reports from authorized testers, including Mozilla CTO Bobby Holley, confirm Mythos found hundreds of bugs in Firefox 150. However, Anthropic's failure to monitor access effectively, despite having logging capabilities in place, raises questions about its security posture. The unauthorized group reportedly avoided using Mythos for cybersecurity tasks so as not to draw attention, a lucky break for Anthropic given the model's claimed dangers. Governments and financial institutions are eager for access, with the NSA reportedly involved, though CISA has so far been bypassed.

Key Points
  • Unauthorized users accessed Anthropic's Claude Mythos via an educated guess about its URL, using data from a Mercor breach and insider contract work.
  • Security researcher Lukasz Olejnik described the breach as an 'entirely imaginable' failure that Anthropic should have anticipated.
  • Anthropic had logging capabilities to detect the breach but failed to monitor effectively, despite Mythos being deemed too dangerous for public release.

Why It Matters

Undermines trust in AI safety claims and highlights systemic security gaps in high-stakes model rollouts.