AI Safety

Claude Code, Codex and Agentic Coding #7: Auto Mode

New Auto Mode lets Claude Code agents run commands automatically with safety checks, plus a major desktop redesign and tax filing features.

Deep Dive

Anthropic has rolled out significant updates to its Claude Code platform, headlined by the introduction of Auto Mode. This long-requested feature allows AI coding agents to execute commands automatically while keeping safety guardrails that block potentially dangerous actions. The system monitors all commands and requires human approval for risky operations, creating a middle ground between fully manual oversight and completely unrestricted automation. Anthropic recommends that users enable Auto Mode where available, positioning it as safer than previous permission-skipping methods and more efficient than manually approving every step.
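The three outcomes described above (run automatically, ask a human, block) can be pictured with a small sketch. This is not Anthropic's implementation, and the article does not say how Auto Mode classifies commands. The rule sets, the `Decision` enum, and the `classify` function below are all hypothetical names chosen for illustration, assuming a simple check on the first word of a shell command.

```python
import shlex
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"  # run without asking
    ASK = "ask"      # pause for human approval
    BLOCK = "block"  # refuse outright


# Hypothetical rule sets, chosen only for illustration.
BLOCKED_PROGRAMS = {"mkfs", "dd", "shutdown"}
RISKY_PROGRAMS = {"rm", "curl", "sudo", "chmod"}
RISKY_GIT_SUBCOMMANDS = {"push", "reset", "clean"}
SHELL_METACHARACTERS = set("|;&><`$")


def classify(command: str) -> Decision:
    """Map a shell command string to a decision using the toy rules above."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unparseable input (e.g. an unclosed quote): defer to a human.
        return Decision.ASK
    if not tokens:
        return Decision.ASK

    program = tokens[0]
    if program in BLOCKED_PROGRAMS:
        return Decision.BLOCK
    # Pipes, redirects, and chaining can hide a second command, so ask.
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return Decision.ASK
    if program in RISKY_PROGRAMS:
        return Decision.ASK
    if program == "git" and len(tokens) > 1 and tokens[1] in RISKY_GIT_SUBCOMMANDS:
        return Decision.ASK
    return Decision.ALLOW


if __name__ == "__main__":
    for cmd in ["ls -la", "git status", "git push origin main", "rm -rf build", "mkfs /dev/sda1"]:
        print(f"{cmd!r}: {classify(cmd).value}")
```

A real system would need far more than a first-word lookup, since the same program can be harmless or destructive depending on its arguments and working directory. The sketch only shows the shape of the middle ground between approving everything by hand and skipping permissions entirely.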

Alongside Auto Mode, Claude Code Desktop received a major redesign focused on parallel agent workflows. The update includes a new sidebar for managing multiple sessions, drag-and-drop workspace arrangement, integrated terminal and file editor, and various performance improvements. The platform now offers full computer use capabilities for Pro and Max plans on macOS, with Windows support expanding. Notably, Claude can now connect to tax services like TurboTax and Aiwyn Tax to handle tax preparation for simpler returns, demonstrating the platform's expanding practical utility beyond pure coding tasks.

The updates come as new benchmark results highlight Claude Opus 4.6's capabilities. In the MirrorCode benchmark, developed by Epoch AI in cooperation with METR, Claude Opus 4.6 successfully reimplemented a 16,000-line bioinformatics toolkit, a task estimated to take human engineers weeks. The result illustrates how quickly AI systems have advanced from being unable to perform complex software engineering tasks consistently. The benchmark also found that Claude models sometimes 'try less hard' on initial attempts, so some testing scenarios may understate their true capabilities.

Key Points
  • Auto Mode enables safer autonomous command execution with built-in safety checks that block dangerous actions
  • Desktop redesign supports parallel agents with a new sidebar, drag-and-drop workspace arrangement, and an integrated terminal and file editor
  • Claude Opus 4.6 reimplemented a 16,000-line bioinformatics toolkit in the MirrorCode benchmark, a task estimated to take human engineers weeks

Why It Matters

These upgrades make AI coding assistants more autonomous while maintaining safety, potentially accelerating development workflows for professional engineers.