AI Safety

Anthropic's Mythos AI creates working exploits from browser crashes 72% of the time

LessWrong AI April 11, 2026

⚡ New model achieves a 72.4% exploit success rate, up from 0.9% with the previous best model.

Deep Dive

Anthropic has unveiled Mythos, a frontier AI model with unprecedented cybersecurity capabilities that can analyze browser crashes and develop working exploits with a 72.4% success rate. This is roughly an 80x improvement over the company's previous best model, Claude Opus 4.6, which achieved only 0.9% on the same benchmark, developed with Mozilla using real Firefox vulnerabilities. The reported performance is considered a floor: Anthropic notes that the model would likely do even better with more tokens and that scaffolding improvements typically boost capabilities over time.
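The 80x figure can be checked directly from the two reported success rates. The short sketch below is only an arithmetic check on the numbers quoted in this summary; the variable and function names are illustrative and do not come from Anthropic or the benchmark itself.

```python
# Success rates as quoted in the article (percent of crashes turned into working exploits).
mythos_rate = 72.4  # Mythos
opus_rate = 0.9     # Claude Opus 4.6


def improvement(new: float, old: float) -> tuple[float, float]:
    """Return (relative factor, absolute gain in percentage points)."""
    return new / old, new - old


factor, gain_points = improvement(mythos_rate, opus_rate)
print(f"relative improvement: {factor:.1f}x")
print(f"absolute gain: {gain_points:.1f} percentage points")
```

The relative factor is what the article rounds to 80x. The absolute gain in percentage points is a useful second view, since a large ratio over a very small baseline can overstate or understate the practical change.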

Although the model passed its Responsible Scaling Policy framework, Anthropic has chosen to restrict Mythos through Project Glasswing, a limited deployment to approximately 50 critical infrastructure organizations across three major cloud providers. The company acknowledges the model could handle "small-scale enterprise networks with weak security posture", a description that covers a significant portion of the internet. For these trusted partners, Anthropic is not blocking exchanges based on classifier triggers, which lets cybersecurity defenders use the model's full capabilities for defense. This is a significant departure from the restrictions on its general-release models.

Key Points

Mythos achieves 72.4% exploit success rate from browser crashes, up from 0.9% with Claude Opus 4.6
Model restricted to 50+ organizations via Project Glasswing despite passing internal safety framework
Anthropic describes target scope as "small-scale enterprise networks with weak security posture"

Why It Matters

Demonstrates AI's rapidly advancing offensive cybersecurity capabilities while raising questions about responsible deployment of dangerous technology.
