AI Safety

The Claude Code Source Leak

Accidental npm package update reveals the Claude Code source, raising security concerns and questions about internal leaks.

Deep Dive

Anthropic, the AI safety company behind Claude, has suffered a significant source code leak. The incident occurred when an update to the Claude Code npm package inadvertently shipped source code that could be extracted from the published package. While the core model weights (the proprietary, trained parameters that constitute the model's intelligence) were not exposed, the leak does reveal internal tooling, build scripts, and stubs for unreleased features. One notable discovery is a reference to an 'undercover mode' designed to suppress Claude's self-identification in commit messages, hinting at competitive tactics that were at least considered internally.
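The mechanics of such a leak are mundane: unless a package declares an explicit "files" allowlist in package.json or excludes paths via .npmignore (npm falls back to .gitignore when no .npmignore exists), npm publishes everything in the project directory, so un-stripped sources and sourcemaps can ride along with a release. Below is a minimal sketch of a pre-publish sanity check; the script name and setup are illustrative assumptions, not Anthropic's actual tooling.

```typescript
// check-publish-safety.ts -- illustrative pre-publish check; nothing here is
// taken from Anthropic's tooling. Assumes a Node.js project with package.json.
import { existsSync, readFileSync } from "node:fs";

const pkg = JSON.parse(readFileSync("package.json", "utf8"));

// npm includes every file not excluded by .npmignore (or .gitignore as a
// fallback) unless an explicit "files" allowlist is declared, so source
// files can ship inside a published tarball unnoticed.
if (!Array.isArray(pkg.files) && !existsSync(".npmignore")) {
  console.warn(
    "No 'files' allowlist and no .npmignore: everything in this directory " +
      "will be published. Audit the tarball with 'npm pack --dry-run'."
  );
}
```

Running `npm pack --dry-run` before publishing lists exactly what the tarball would contain, which is the simplest way to catch stray source files. As for the 'undercover mode' stub, the leaked reference suggests a flag that skips the attribution trailer Claude Code normally appends to commits it authors. The following is a purely hypothetical reconstruction of that described behavior; the flag name and function are assumptions, and none of this is taken from the leaked code.

```typescript
// Hypothetical sketch only: illustrates the behavior described in the leak,
// not the actual implementation.
interface CommitOptions {
  message: string;
  undercover?: boolean; // assumed flag name for the leaked 'undercover mode'
}

// Claude Code normally appends an attribution trailer to commits it authors;
// a suppression flag like the leaked stub would simply skip that step.
function buildCommitMessage({ message, undercover = false }: CommitOptions): string {
  const trailer = "\n\nCo-Authored-By: Claude <noreply@anthropic.com>";
  return undercover ? message : message + trailer;
}

console.log(buildCommitMessage({ message: "fix: handle empty input" }));
console.log(buildCommitMessage({ message: "fix: handle empty input", undercover: true }));
```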

This leak marks at least the third security incident for Anthropic in recent weeks, following the leak of a memo tied to a Department of Defense (DoD) incident and a separate leak of approximately 3,000 internal assets, including details of unreported testing of a 'Mythos' model. The timing, coinciding with CEO Dario Amodei's international travel, has fueled speculation about whether these are coincidental human errors, a sign of operational strain, or potential 'enemy action.' The company has begun issuing takedown notices to GitHub repositories hosting the code, though some notices have reportedly been misdirected at unrelated forks.

Key Points
  • Source code for Claude Code leaked via npm package, exposing internal tools and scripts but not the core AI model weights.
  • At least the third security incident in recent weeks, following a DoD memo leak and a separate leak of roughly 3,000 internal assets, including material on unreported 'Mythos' model testing.
  • The code reveals a stubbed 'undercover mode' for hiding Claude's authorship in commit messages, raising questions about development practices and security.

Why It Matters

Repeated leaks undermine trust in AI lab security and could accelerate open-source replication, eroding competitive moats.