Open Source

Qwen 3.5 goes bankrupt on Vending-Bench 2

A top AI model has reportedly bombed a major new test, scoring zero.

Deep Dive

Qwen 3.5 has reportedly scored 0% on the newly released Vending-Bench 2 benchmark, a catastrophic result for a model from a major lab. A viral post highlighting the collapse has raised immediate questions about the model's real-world capabilities and the validity of its other claimed results. The flop is fueling intense debate about testing standards and whether some labs are 'overfitting' to popular benchmarks.

Why It Matters

This failure, if confirmed, undermines trust in published AI results and forces a reevaluation of how we measure model intelligence.