GLM-5.1 now sits above Opus 4.6 on SWE-Bench Pro and it’s completely open, though API access costs Input $1.4 / Output $4.4
A new open-source AI model beats GPT-5.4 on coding tasks and runs 8-hour autonomous agent loops with no licensing fees.
Zhipu AI's GLM-5.1, a 200-billion parameter open-source model, has achieved a significant milestone by scoring 58.4 on the rigorous SWE-Bench Pro benchmark, which tests an AI's ability to solve real-world software engineering problems. This score not only narrowly edges out Anthropic's flagship Claude Opus 4.6 (56.7) but also surpasses OpenAI's GPT-5.4, establishing a new performance bar for coding-focused AI. Crucially, GLM-5.1 is released under the fully permissive Apache 2.0 license, granting developers complete freedom to use, modify, and deploy it without restrictive terms or usage fees.
The model's most disruptive feature is its ability to run fully autonomous agent loops for up to 8 hours, planning, coding, testing, and debugging entire applications from start to finish without human intervention. This positions it as a direct competitor to expensive, token-based coding assistants like GitHub Copilot Enterprise. While self-hosting a 200B model requires significant hardware, the cost structure shifts from per-token API fees to a one-time infrastructure investment, making it potentially economical for teams with sustained, high-volume coding needs. This release is part of a growing trend of high-performance open-source models, primarily from Chinese labs like 01.AI and Zhipu, that are rapidly closing the gap with proprietary Western counterparts.
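The plan-code-test-debug cycle described above can be sketched as a simple loop. This is an illustrative skeleton only, not any real GLM SDK: `chat` and `run_tests` are hypothetical caller-supplied hooks (a model endpoint and a test runner), and the 8-hour figure becomes a wall-clock budget.

```python
import time

def agent_loop(task, chat, run_tests, max_hours=8.0):
    """Plan -> code -> test -> debug until tests pass or time runs out.

    `chat(messages)` returns the model's proposed patch (a string);
    `run_tests()` returns (passed, failure_log). Both are supplied by
    the caller; neither name comes from any actual GLM-5.1 API.
    """
    messages = [{"role": "user", "content": f"Plan and implement: {task}"}]
    deadline = time.time() + max_hours * 3600
    while time.time() < deadline:
        patch = chat(messages)            # model proposes code changes
        passed, log = run_tests()         # verify them against the suite
        if passed:
            return patch                  # tests green: done
        # Feed the failure log back so the model can debug its own work.
        messages.append({"role": "assistant", "content": patch})
        messages.append({"role": "user", "content": f"Tests failed:\n{log}"})
    return None                           # time budget exhausted
```

The key design point is the feedback edge: test output goes back into the conversation, which is what lets a long-running loop converge without a human in it.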
- GLM-5.1 scores 58.4 on SWE-Bench Pro, beating Claude Opus 4.6 (56.7) and GPT-5.4.
- Released under Apache 2.0 license, it enables 8-hour autonomous agent loops for end-to-end app development.
- When self-hosted, operational cost shifts from per-token fees to infrastructure, challenging the pricing model of OpenAI and Anthropic.
Why It Matters
This dramatically lowers the cost of advanced AI coding assistance, forcing a reevaluation of paid API services for development teams.
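To put the cost claim in concrete terms, here is a back-of-the-envelope calculation at the quoted $1.4 input / $4.4 output rates. Note an assumption: the source does not state the unit, so this treats the prices as USD per million tokens (the industry convention); the token counts are invented for illustration.

```python
# Assumed unit: USD per 1M tokens (not stated in the source).
INPUT_PER_M, OUTPUT_PER_M = 1.4, 4.4

def run_cost(input_tokens, output_tokens):
    """API cost in USD for one agent run at the quoted GLM-5.1 rates."""
    return (input_tokens / 1e6) * INPUT_PER_M + (output_tokens / 1e6) * OUTPUT_PER_M

# Hypothetical 8-hour loop consuming 5M input and 1M output tokens:
print(f"${run_cost(5_000_000, 1_000_000):.2f}")  # prints $11.40
```

Even under these rough assumptions, a full working day of autonomous coding lands in the tens of dollars via the API, and at zero marginal licensing cost when self-hosted, which is the pressure on per-seat and per-token pricing the section describes.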