MiMo-V2.5-Pro - the actual best open-weights model
Open-weights model beats Kimi K2.6 on cost and speed, with an 88% win rate as Good
Xiaomi's MiMo-V2.5-Pro has emerged as a top contender in a custom benchmark that tests AI reasoning through autonomous games of Blood on the Clocktower, a complex social deduction game similar to The Traitors. After Kimi K2.6 disrupted the leaderboard, MiMo-V2.5-Pro now matches it as a dominant player, but with far better efficiency. The model processes 183,639 tokens per game on average at a cost of $0.99 per game: less than half Kimi K2.6's $2.65 (580k tokens) and significantly cheaper than Claude Opus 4.6's $3.76. Matches finish in 2-3 hours, versus 10-15 hours for Kimi, making MiMo-V2.5-Pro far more practical for real-world use. A low 0.4% tool-call error rate adds to its reliability.
However, the model's win rate shows a stark imbalance: 88% as Good but only 48% as Evil, which holds it back from the absolute top spot. Notable gameplay examples show it capably reasons from other players' perspectives and makes clean deductions (e.g., identifying the Baron), but it occasionally makes critical mistakes, such as expecting an evil role to self-reveal or a minion to confess its role. Despite this, MiMo-V2.5-Pro currently offers the best value among high-end open-weights models, combining competitive reasoning with significantly lower cost and latency. Its benchmark results suggest it could serve as a strong, affordable backbone for multi-agent AI systems requiring nuanced social reasoning.
- MiMo-V2.5-Pro costs $0.99/game (183k tokens) vs Kimi K2.6's $2.65 (580k tokens) and Claude Opus 4.6's $3.76.
- Matches complete in 2-3 hours, a dramatic improvement over Kimi K2.6's 10-15 hour games.
- Achieves 88% Good win rate but only 48% Evil win rate, with a low 0.4% tool call error rate.
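The per-game figures above can be turned into a blended per-token price (a rough sketch: it assumes the quoted token counts combine input and output, and uses only the article's numbers). Interestingly, Kimi K2.6 comes out slightly cheaper per token, which means MiMo-V2.5-Pro's cost advantage comes from using roughly a third as many tokens per game, not from a lower token price.

```python
# Derive a blended cost per million tokens from the article's per-game figures.
# Assumption: the quoted token counts are total (input + output) per game.
models = {
    "MiMo-V2.5-Pro": (0.99, 183_639),   # ($ per game, avg tokens per game)
    "Kimi K2.6": (2.65, 580_000),
}

for name, (cost_usd, tokens) in models.items():
    per_million = cost_usd / tokens * 1_000_000
    print(f"{name}: ${per_million:.2f} per 1M tokens")
# MiMo-V2.5-Pro works out to about $5.39/1M tokens, Kimi K2.6 to about $4.57/1M —
# so the ~2.7x cost gap per game is driven by token efficiency.
```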
Why It Matters
An open-weights model delivers top-tier multi-agent reasoning at a fraction of competitors' cost, making AI for complex social tasks practical at scale.