Kimi K2.6 just beat Claude, GPT-5.5, and Gemini in a coding challenge
An open-weights model from China just crushed Western AI giants in a real-time puzzle challenge.
In a head-to-head coding challenge featuring ten AI models, Kimi K2.6—an open-weights model from Chinese startup Moonshot AI—emerged victorious with 22 match points and a 7-1-0 record. The contest, dubbed the Word Gem Puzzle, required bots to slide letter tiles on grids ranging from 10×10 to 30×30 and form valid English words of seven letters or more to earn positive scores. GPT-5.5 placed third, Claude Opus 4.7 fifth, and Gemini Pro 3.1 sixth. Xiaomi's MiMo V2-Pro took second place with a static scanning strategy, while DeepSeek V4 and Muse Spark finished near the bottom.
Kimi's winning approach relied on a greedy sliding algorithm: it evaluated every legal move for the new positive-value words it would create and executed the best one immediately. When no positive word was reachable, it fell back to the first legal move in alphabetical order, which caused inefficient edge-oscillation on smaller grids but paid off dramatically on larger, more scrambled boards. Its cumulative score of 77 was the highest in the tournament. MiMo, in contrast, never slid tiles at all, instead scanning the initial grid for intact long words—a brittle strategy that collapsed on larger grids. Claude likewise avoided sliding, while GPT-5.5 played conservatively at roughly 120 slides per round but couldn't match Kimi's volume and adaptability.
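The strategy described above can be sketched roughly as follows. This is a minimal illustration of a greedy slide-evaluation loop with an alphabetical fallback, not Kimi's actual code; all function names, data shapes, and the length-based scoring rule are assumptions for the sake of the example.

```python
# Hypothetical sketch of the greedy sliding strategy described above:
# score every legal slide by the new positive-value words it would create,
# take the best one, and fall back to the first legal move alphabetically
# when no positive word is reachable. Names and rules are illustrative.

def word_score(word, min_len=7):
    """Assumed rule: only words of min_len letters or more score points."""
    return len(word) if len(word) >= min_len else 0

def choose_move(current_words, legal_moves, words_formed):
    """Pick the slide that creates the highest-value new words.

    `legal_moves` maps a move label (e.g. 'A3-up') to the board state that
    results from it; `words_formed(board)` returns the set of valid words
    on that board; `current_words` is the set already on the grid.
    """
    best_move, best_gain = None, 0
    for move in sorted(legal_moves):  # deterministic evaluation order
        new_words = words_formed(legal_moves[move]) - current_words
        gain = sum(word_score(w) for w in new_words)
        if gain > best_gain:
            best_move, best_gain = move, gain
    if best_move is None:             # no positive word reachable:
        best_move = min(legal_moves)  # first legal move alphabetically
    return best_move
```

The alphabetical fallback is what would produce the edge-oscillation the article mentions: on a small, nearly solved grid, no move yields a positive word, so the bot repeatedly takes the same lexicographically first slide.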
- Kimi K2.6 scored 22 match points (7-1-0), beating GPT-5.5 (third) and Claude Opus 4.7 (fifth).
- Its greedy sliding algorithm achieved the highest cumulative score (77) by aggressively moving tiles to form long words.
- MiMo V2-Pro placed second with a static scanning strategy that failed on larger grids, exposing the advantage of real-time tile manipulation.
Why It Matters
Open-weights Chinese models are now competitive with top Western systems on specialized coding tasks.