Open Source

Qwen3.6 is incredible with OpenCode!

A developer reports the Qwen3.6-35B model successfully implemented complex RLS security across a multi-language codebase.

Deep Dive

A developer's viral post highlights a significant leap in local AI coding assistants, praising Alibaba's Qwen3.6-35B model. Running the 'Qwen3.6-35B-A3B' variant locally on an RTX 4090 (24GB of VRAM) via llama.cpp, the user tasked it with implementing PostgreSQL Row-Level Security (RLS) across a complex, multi-service codebase written in Rust, TypeScript, and Python. Despite initial bugs, the model navigated the task well, iterating on compiler errors and proposing a sweeping 29-file edit before the user guided it to a more elegant design built on request-scoped database connections.
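The design the developer steered the model toward, request-scoped connections backing Postgres RLS, follows a well-known pattern: each request binds an identity to its database connection, and row-level policies filter every query against that identity. A minimal sketch of the scoping idea, using sqlite3 as a runnable stand-in (the table, the `app.current_tenant` setting, and all names here are illustrative assumptions, not the post's actual code):

```python
import sqlite3
from contextlib import contextmanager

# With PostgreSQL, the request-scoped connection would run
#   SET LOCAL app.current_tenant = '<id>';
# and an RLS policy such as
#   CREATE POLICY tenant_isolation ON documents
#       USING (tenant_id = current_setting('app.current_tenant'));
# would filter rows server-side. sqlite3 has no RLS, so this sketch
# enforces the same per-identity filter in the accessor instead.

def init_db(conn):
    conn.execute("CREATE TABLE documents (tenant_id TEXT, body TEXT)")
    conn.executemany("INSERT INTO documents VALUES (?, ?)",
                     [("a", "alpha doc"), ("b", "beta doc")])

@contextmanager
def request_scoped(conn, tenant_id):
    """Bind one request's identity to a connection for its lifetime."""
    def fetch_documents():
        rows = conn.execute(
            "SELECT body FROM documents WHERE tenant_id = ?", (tenant_id,))
        return [body for (body,) in rows]
    yield fetch_documents
    # With Postgres, the transaction would end here and SET LOCAL expire.

conn = sqlite3.connect(":memory:")
init_db(conn)
with request_scoped(conn, "a") as fetch_documents:
    print(fetch_documents())  # only tenant "a" rows are visible
```

The appeal of this pattern over a 29-file edit is that enforcement lives in one place: the connection scope plus the database policy, rather than a tenant check scattered through every query site.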

The setup used a 262,144-token context window, IQ4_NL quantization, and specific llama.cpp server settings to handle parallel tool calls from the OpenCode interface, sustaining over 100 tokens/second of output. The developer concluded that while it doesn't 'one-shot' complex Rust code the way Anthropic's Claude 3 Opus can, its ability to reason, plan, and iterate locally makes it the most capable open-source coding model they've tested, bringing the community closer to the 'holy grail' of self-hosted AI development.
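A launch command approximating that setup might look as follows. This is a sketch, not the developer's actual invocation: the model filename is a placeholder, and only the context size and quantization come from the post; the remaining values are assumptions using standard llama.cpp llama-server flags.

```shell
# -c sets the context window; -ngl offloads layers to the GPU;
# --jinja enables the chat-template handling that tool calling needs.
llama-server \
  -m Qwen3.6-35B-A3B-IQ4_NL.gguf \
  -c 262144 \
  -ngl 99 \
  --jinja \
  --port 8080
```

OpenCode would then be pointed at the server's OpenAI-compatible endpoint on that port.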

Key Points
  • The Qwen3.6-35B model successfully planned and iterated on implementing PostgreSQL RLS security across Rust, TypeScript, and Python services.
  • Running locally with a 262k context on an RTX 4090, it used ~21GB VRAM and achieved 100+ tokens/sec output via a customized llama.cpp server.
  • The developer found it superior to other local models like Gemma 4 for iterative coding, though it still requires guidance compared to cloud models like Claude.
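The iterative coding loop described above (build, feed compiler diagnostics back to the model, apply its patch, retry) can be sketched in a few lines. The `ask_model_for_patch` and `apply_patch` hooks are hypothetical stand-ins for the OpenCode/llama.cpp round trip, not a real API:

```python
import subprocess

def iterate_until_clean(build_cmd, ask_model_for_patch, apply_patch,
                        max_rounds=5):
    """Run the build; on failure, hand diagnostics to the model and retry."""
    for fix_rounds in range(max_rounds):
        result = subprocess.run(build_cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return fix_rounds  # number of model-driven fix rounds it took
        # Compiler diagnostics arrive on stderr; feed them back verbatim.
        apply_patch(ask_model_for_patch(result.stderr))
    raise RuntimeError(f"build still failing after {max_rounds} rounds")
```

For the Rust services in the post, `build_cmd` would be something like `["cargo", "check"]`; the same loop works for any toolchain whose exit code signals success.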

Why It Matters

This demonstrates a powerful, private alternative to cloud-based coding AIs, enabling complex development work without sending code to external servers.