AI Safety

Can We Secure AI With Formal Methods? January-March 2026

An AI security expert issues a 'draft' call for a red-blue agent army to harden critical software before attackers use AI to find 0-days.

Deep Dive

In a viral Substack post titled 'Can We Secure AI With Formal Methods?', AI security researcher Quinn declares 2026 the year the industry must pivot to 'secure program synthesis' (SPS) and formal methods for AI (FMxAI). The declaration comes amid an exponential rise in related blog posts and a stark warning from security expert Nicholas Carlini, who stated that Anthropic's Claude Opus model is now better at finding critical software vulnerabilities (0-days) than human experts. Quinn argues the field is currently 'offense dominated' and calls for a mobilization on the scale of Y2K preparation, which consumed 20-40% of corporate IT budgets, to harden software before AI-powered attacks become widespread.

To operationalize this, Quinn provides a concrete technical blueprint: building autonomous 'red-blue' agent loops. The proposed system would use one AI agent armed with red-team tools (like fuzzers and static analyzers) to find vulnerabilities in a code repository, and a second 'blue' agent to automatically patch them. The post includes a pseudo-code command urging readers to fork critical 'loadbearing' open-source projects, harden them with this AI-driven process, and benchmark the security improvements. This call to action is framed as a 'draft' for a decentralized 'secure program synthesis army' to proactively defend infrastructure, coinciding with initiatives like the UK AISI's formal call for information on securing AI compute infrastructure.
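The red-blue loop described above can be sketched in a few lines. This is a minimal illustrative stand-in, not Quinn's actual pseudocode: the `red_agent` and `blue_agent` functions below are hypothetical stubs using toy pattern matching, where a real system would invoke LLM-backed fuzzers, static analyzers, and patch generators against a forked repository.

```python
# Minimal sketch of a red-blue agent loop (illustrative stand-ins only).
# A real pipeline would call LLM agents armed with fuzzers and static
# analyzers (red) and automated patch generation (blue).

# Toy finding -> fix table; a real blue agent would synthesize patches.
PATTERNS = {
    "os.system(": "subprocess.run(",   # shell-injection-prone call
    "pickle.loads(": "json.loads(",    # unsafe deserialization
}

def red_agent(repo):
    """Red team: scan every file, report (path, pattern) findings."""
    return [(path, p) for path, code in repo.items()
            for p in PATTERNS if p in code]

def blue_agent(repo, findings):
    """Blue team: patch each finding in place (naive rewrite here)."""
    for path, pattern in findings:
        repo[path] = repo[path].replace(pattern, PATTERNS[pattern])
    return repo

def harden(repo, max_rounds=5):
    """Loop until the red agent finds nothing, or rounds run out."""
    for _ in range(max_rounds):
        findings = red_agent(repo)
        if not findings:
            break
        repo = blue_agent(repo, findings)
    return repo, red_agent(repo)

# Toy "repository": one file with a risky call.
repo = {"util.py": "os.system(cmd)\n"}
hardened, remaining = harden(repo)
print(remaining)                 # -> [] (no findings left)
print(hardened["util.py"])       # -> subprocess.run(cmd)
```

The loop structure (scan, patch, re-scan until clean) is the core idea; benchmarking would compare `red_agent` findings before and after hardening, as the post urges readers to do on forked open-source projects.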

Key Points
  • Researcher declares 2026 the 'year of secure program synthesis' (SPS) following the 'year of the agent' in 2025, citing an explosion in related technical writing.
  • Cites Nicholas Carlini's warning that Claude Opus outperforms humans at finding critical software vulnerabilities (0-days), signaling an urgent, offense-dominated landscape.
  • Proposes a concrete solution: building autonomous AI 'red-blue' agent loops that continuously find and patch vulnerabilities in critical open-source software, modeled after Y2K-scale mobilization.

Why It Matters

As AI becomes capable of automating cyber attacks, proactively hardening software with AI-driven formal methods becomes a critical defensive race for organizations.