I just realised how good GLM 5 is
A developer who has spent 12B tokens on Claude finds GLM 5 superior for building a real-time chat app.
A heavy Claude user, with over 12 billion tokens consumed in recent months, ran a side-by-side test of Claude Code (powered by Opus 4.6) and Zhipu AI's GLM 5, accessed through the OpenCode platform's Zen plan. On the first task, a simple dashboard inventory tracker, Claude had a slight edge. The result reversed on a more complex task: building a real-time chat application over WebSockets.
For the real-time chat app, Claude Code's first attempt failed to implement working message streaming: new messages appeared only after a page refresh, a critical flaw for such an application. In contrast, GLM 5's output handled real-time streaming from the first prompt. Even after both models received detailed feedback for iterative improvement, GLM 5 kept its lead and produced the more functionally complete solution. This result challenges the prevailing assumption that Claude Code is the undisputed leader for complex coding tasks, and it suggests GLM 5 may have particular strengths in architecting stateful, real-time systems.
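The source does not include either model's code, so the following is only a minimal sketch of the difference described above, under stated assumptions. It uses an in-memory stand-in for a WebSocket server: `ChatRoom`, `FakeClient`, `post_store_only`, and `post_and_broadcast` are hypothetical names invented for illustration, and each subscriber callback plays the role of a send call on an open socket. One plausible cause of the "refresh required" symptom is a server that stores a message without pushing it to connected clients.

```python
from typing import Callable, List, Tuple


class ChatRoom:
    """Hypothetical in-memory stand-in for a WebSocket chat server."""

    def __init__(self) -> None:
        self.history: List[str] = []  # messages persisted on the server
        self.subscribers: List[Callable[[str], None]] = []  # one per open connection

    def subscribe(self, send: Callable[[str], None]) -> None:
        # In a real server this would happen when a socket connects.
        self.subscribers.append(send)

    def post_store_only(self, text: str) -> None:
        # Failure mode: the message is saved, but no connected client is told.
        self.history.append(text)

    def post_and_broadcast(self, text: str) -> None:
        # Real-time behaviour: save the message, then push it to every connection.
        self.history.append(text)
        for send in list(self.subscribers):
            send(text)


class FakeClient:
    """Hypothetical browser tab: shows pushed messages, or reloads history."""

    def __init__(self) -> None:
        self.visible: List[str] = []

    def on_message(self, text: str) -> None:
        # Called when the server pushes a message over the open connection.
        self.visible.append(text)

    def refresh(self, room: ChatRoom) -> None:
        # A page reload re-fetches the full history from the server.
        self.visible = list(room.history)


def demo() -> Tuple[List[str], List[str], List[str]]:
    room = ChatRoom()
    bob = FakeClient()
    room.subscribe(bob.on_message)

    room.post_store_only("hi")
    before_refresh = list(bob.visible)  # nothing was pushed to Bob

    bob.refresh(room)
    after_refresh = list(bob.visible)  # the message appears only after a reload

    room.post_and_broadcast("still there?")
    after_broadcast = list(bob.visible)  # arrives without any reload

    return before_refresh, after_refresh, after_broadcast


if __name__ == "__main__":
    print(demo())
```

In a real application the subscriber list would hold live WebSocket connections and the broadcast loop would need to handle disconnected clients; the sketch leaves that out to keep the contrast between the two posting paths visible.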
The developer's shift from local-coding skeptic to proponent of GLM 5 for specific use cases points to a changing competitive landscape. While models like Kimi K2.5 were dismissed in the same test, GLM 5's performance indicates that Zhipu AI's latest model is a serious contender. This real-world test moves beyond synthetic benchmarks and focuses on whether a demanding, stateful application actually works, which is a key metric for professional developers.
- GLM 5 via OpenCode beat Claude Code (Opus 4.6) on a complex real-time WebSocket chat app build.
- Claude's first attempt lacked working streaming, a critical failure, while GLM 5's code worked correctly.
- The test was run by a developer with over 12 billion tokens of prior Claude usage, adding credibility.
Why It Matters
It signals a viable, high-performance alternative to Claude for complex software engineering, potentially lowering costs and increasing options for dev teams.