Google upgrades Gemini 3 DeepThink: Advancing science, research and engineering
Google's upgraded model reportedly outscored OpenAI's best on science and coding benchmarks.
Deep Dive
Google's Gemini 3 DeepThink model has reportedly set a new standard, scoring 48.4% on the challenging 'Humanity's Last Exam' benchmark without using tools. It also reportedly achieved an unprecedented 84.6% on ARC-AGI-2 and a 3455 Elo rating on Codeforces, and separately reached gold-medal level on the International Mathematical Olympiad. If they hold up, these results suggest a significant leap in reasoning and problem-solving capabilities for science, research, and engineering tasks.
Why It Matters
A leap in reasoning of this size could accelerate scientific discovery and intensify Google's competition with OpenAI.