Media & Culture

Mythos AI completes 18 of 41 n-day exploits, versus 1 of 41 for its predecessor and zero for open-source models

Proprietary Mythos model completes 18 times as many exploits as its predecessor on an n-day exploit benchmark.

Deep Dive

Mythos successfully completed 18 of 41 n-day exploits, compared with 1 of 41 for its predecessor. Open-source models with publicly available weights scored zero on the same test set. The result demonstrates the model’s ability to autonomously exploit known vulnerabilities, a critical skill for penetration testing and security research, and highlights a growing capability gap between proprietary and open-source AI in offensive security.

Key Points
  • Mythos achieved 18 out of 41 n-day exploits, compared to just 1/41 for the previous version (5.5).
  • Open-source models with publicly available weights scored zero exploits on the same test set.
  • The 18-fold increase, from roughly 2.4% to roughly 43.9% of the test set, indicates a major advance in AI's ability to autonomously exploit known vulnerabilities.
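
The arithmetic behind these figures can be checked with a short sketch. The counts (18, 1, and 0 out of 41) come from the reported results; the dictionary labels and the helper name `success_rate` are illustrative assumptions, not part of any published benchmark tooling.

```python
from fractions import Fraction

TOTAL = 41  # size of the n-day test set, as reported

# Exploit counts as reported; the keys are illustrative labels only.
results = {
    "Mythos": 18,
    "previous version (5.5)": 1,
    "open-source models": 0,
}


def success_rate(successes: int, total: int = TOTAL) -> Fraction:
    """Return the exact share of the test set that was exploited."""
    return Fraction(successes, total)


rates = {name: success_rate(count) for name, count in results.items()}

# Ratio of Mythos to its predecessor. A ratio against the open-source
# models is undefined, because their score is zero.
improvement = rates["Mythos"] / rates["previous version (5.5)"]

for name, rate in rates.items():
    print(f"{name}: {results[name]}/{TOTAL} = {float(rate):.1%}")
print(f"Mythos vs. previous version: {float(improvement):.0f}x")
```

Using exact fractions keeps the 18x ratio free of floating-point rounding. Because the predecessor's baseline is a single exploit, the multiplier is sensitive to small changes in that count, so the absolute rates are worth reading alongside it.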

Why It Matters

Proprietary AI models are rapidly outpacing open-source alternatives in offensive security, raising concerns about who can access these capabilities and how they may be used.