Media & Culture

Anthropic’s most dangerous AI model just fell into the wrong hands

A cybersecurity model capable of exploiting OS and browser vulnerabilities was illicitly accessed for two weeks.

Deep Dive

Anthropic's Claude Mythos Preview, a highly capable AI model for cybersecurity testing, was accessed by an unauthorized group for approximately two weeks, according to a Bloomberg report. The model, which Anthropic has warned could be dangerous if weaponized, is designed to identify and exploit vulnerabilities in every major operating system and web browser. Access was gained using a third-party contractor's credentials together with internet sleuthing tools; the group leveraged knowledge from a recent Mercor data breach to guess the model's online location. Anthropic is investigating the breach but says there is no current evidence the unauthorized access has affected its core systems.

The illicit access began on April 7th, the same day Anthropic formally announced the Mythos Preview's limited release to select tech giants, including Nvidia, Google, and Microsoft, through its Project Glasswing initiative. The group, made up of members of a Discord channel dedicated to finding unreleased AI models, has been actively using the tool and provided screenshots and demonstrations to Bloomberg, though reportedly not for its intended cybersecurity purposes, in order to avoid detection. The incident underscores the security and proliferation challenges facing AI developers as they create increasingly powerful, dual-use models that are attractive targets for unauthorized access, even within supposedly secure, limited testing frameworks.

Key Points
  • The Claude Mythos Preview model, capable of exploiting OS and browser vulnerabilities, was accessed via a third-party contractor on April 7th.
  • The unauthorized group used data from a Mercor breach to locate the model and has been using it regularly for two weeks.
  • Official access is restricted to major tech companies via Project Glasswing, with no public release planned due to weaponization risks.

Why It Matters

This breach demonstrates the high-stakes challenge of controlling access to powerful, dual-use AI tools, raising concerns about their proliferation and misuse.