AI Safety

Google's Deep Think Hits 84.6% on ARC-AGI, OpenAI Launches 1000 TPS Coding Model

LessWrong AI February 14, 2026

⚡Chinese open-source models now rival Claude Opus at 95% lower cost.

Deep Dive

Google's Gemini 3 Deep Think mode has put Google back at the frontier, scoring 84.6% on the ARC-AGI-2 benchmark and reaching Olympiad gold-medal level. At the same time, OpenAI launched GPT-5.3-Codex-Spark, a speed-optimized coding model that runs on Cerebras hardware and generates over 1,000 tokens per second. Chinese open-source models are also reshaping the landscape: MiniMax M2.5 matches Opus 4.6 performance at just $1.20 per million tokens.

Why It Matters

Taken together, these three releases advance agent capabilities through stronger reasoning and faster code generation, while sharply lowering costs for developers.
