> "I'm running qwen3.6-35b-a3b with 8 bit quant and 64k context thru OpenCode on my mbp m5 max 128gb and it's as good as claude"
A developer reports Qwen3.6-35B, quantized to 8-bit, matches Claude's quality for complex coding tasks.
A developer's viral post highlights the impressive local performance of Alibaba's Qwen3.6-35B-A3B model. Running on a MacBook Pro M5 Max with 128GB RAM via the OpenCode interface, the model was loaded with 8-bit quantization—storing weights at lower precision, roughly halving memory use compared with 16-bit weights—and a 64K-token context window. The user, who had previously tested other local models such as Gemma 4, Qwen3 Coder Next, and Nemotron, found Qwen3.6's response speed and accuracy on complex, multi-step coding tasks to be on par with Anthropic's cloud-based Claude models.
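The post doesn't say which local runtime OpenCode was pointed at. As one illustrative sketch, a GGUF build of the model served with llama.cpp's `llama-server` would reproduce the described setup (the model file name here is an assumption; `Q8_0` is llama.cpp's 8-bit weight quantization):

```shell
# Sketch only: serve an 8-bit GGUF quant locally with a 64K context window.
# -m: path to the quantized model file (assumed name)
# -c: context window in tokens (64K = 65536)
llama-server -m ~/models/Qwen3.6-35B-A3B-Q8_0.gguf -c 65536 --port 8080
# A coding tool can then target the OpenAI-compatible endpoint at
# http://localhost:8080/v1 instead of a cloud API.
```

Any server exposing an OpenAI-compatible API (LM Studio, Ollama, MLX-based servers) would slot into the same workflow.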
The specific test involved a long research task in which the model made multiple tool calls to investigate why R8, Android's code shrinker and optimizer, was breaking serialization in an Android app. The model's ability to handle this intricate debugging workflow locally, without sending sensitive code to external servers, was a key benefit. The developer said the setup would become their 'daily driver,' replacing their previous workflow of using Kimi k2.5 via OpenCode Zen. The shift underscores a growing trend of capable large language models (LLMs) running efficiently on consumer hardware, offering professionals a powerful and private alternative to cloud-based AI assistants.
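The post doesn't share the root cause the model found, but this class of R8 failure is common: minification strips or renames fields and members that reflection-based serializers resolve by name at runtime. A typical remedy is a keep rule in the app's `proguard-rules.pro`, for example the standard rule for `java.io.Serializable` classes:

```
# Keep the members R8 must not strip or rename for Java serialization
# to keep working (the actual fix in the post may have differed).
-keepclassmembers class * implements java.io.Serializable {
    static final long serialVersionUID;
    private static final java.io.ObjectStreamField[] serialPersistentFields;
    private void writeObject(java.io.ObjectOutputStream);
    private void readObject(java.io.ObjectInputStream);
    java.lang.Object writeReplace();
    java.lang.Object readResolve();
}
```

JSON libraries that rely on reflection (e.g. Gson) need analogous keep rules for their model classes, or field-name annotations that survive renaming.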
- Qwen3.6-35B-A3B model runs locally with 8-bit quantization on an M5 Max MacBook Pro, using a 64K context window.
- User tested it on a complex Android app debugging task involving serialization issues and multiple tool calls.
- Performance was found to rival Anthropic's Claude, prompting a switch from cloud-based tools to a local daily driver.
Why It Matters
Enables high-quality, private AI coding assistance on local hardware, reducing dependency on cloud API providers and the need to trust them with proprietary code.