Qwen 3.5 Reportedly Beats GPT-5.2 and Gemini 3 Pro on Key Benchmark
A new benchmark shows a massive leap for Alibaba's open-source model.
Deep Dive
A new spatial reasoning benchmark, MineBench, shows Qwen 3.5 making an 'insane improvement' over its predecessor. The benchmark's creator reports that some Qwen 3.5 builds performed close to, if not better than, top-tier closed models such as Claude Opus 4.6, GPT-5.2 Pro, and Gemini 3 Pro. This suggests a dramatic narrowing of the performance gap between leading open-source and closed-source AI models on specific reasoning tasks.
Why It Matters
Open-source models may be catching up to the most advanced closed models, potentially reshaping the competitive landscape.