Open Source

Local Qwen3.6 quant rivals frontier models in HTML canvas coding test

Local 27B model beats some cloud APIs on a complex single-file animation task

Deep Dive

A Reddit user (u/Fragrant-Remove-9031) compared local models, mainly Qwen3.6 variants, against frontier models available through Perplexity on a single demanding coding task: generating a single-file HTML canvas animation of a car driving with parallax scrolling, spinning wheels, and cinematic lighting, all in vanilla JavaScript with no libraries. The frontier models tested were Claude Sonnet 4.6 Thinking, Gemini 3.1 Pro Thinking, GPT 5.4 Thinking, and Kimi k2.6 Thinking. The local models were Qwen3.5 4B/9B, Qwen3.6-27B (both base and Claude-opus-reasoning-distilled), Qwen3.6-35B A3B, and Gemma-4-31b-it, all run on a Ryzen 5 5600 with 24GB DDR4 and an RX 5700 XT (8GB). Token speeds ranged from 1.91 tok/s (Gemma) to 80 tok/s (Qwen3.5 4B).
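For readers unfamiliar with the two effects the prompt asks for, here is a minimal Python sketch of the arithmetic usually behind them. It is not taken from any model's output in the test. The function names, the per-layer depth factors, and the pixel values are illustrative assumptions. Parallax is modeled by scrolling each background layer at a fraction of the camera's speed, and wheel spin by the rolling-without-slipping relation (angle = distance / radius).

```python
import math


def layer_scroll(camera_x: float, depth_factor: float, tile_width: float) -> float:
    """Scroll amount in [0, tile_width) for a repeating background layer.

    depth_factor is an assumed per-layer multiplier: small values for
    distant layers (slow movement), 1.0 for the road itself.
    """
    return (camera_x * depth_factor) % tile_width


def wheel_angle(distance: float, wheel_radius: float) -> float:
    """Rotation in radians of a wheel that rolls without slipping."""
    return (distance / wheel_radius) % (2 * math.pi)


if __name__ == "__main__":
    camera_x = 300.0  # hypothetical distance travelled, in pixels
    # Hypothetical layers: farther layers get smaller depth factors.
    for name, depth in [("sky", 0.25), ("hills", 0.5), ("road", 1.0)]:
        print(name, layer_scroll(camera_x, depth, tile_width=256.0))
    print("wheel", wheel_angle(camera_x, wheel_radius=20.0))
```

In a canvas implementation, each layer would typically be drawn twice per frame, at `-scroll` and at `-scroll + tile_width`, so the seam never shows. The same two formulas carry over directly to JavaScript.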

Surprisingly, the local Qwen3.6-27B Q4_K_M (2.65 tok/s) ranked second overall for visual quality, behind only Kimi k2.6 Thinking. It delivered natural parallax, realistic wheel rotation, and a smooth road feel, outperforming several larger cloud models. The user noted that frontier models did not universally dominate, which suggests that quantized local models can hold their own on specific visual coding tasks. The test points to the growing viability of local AI for production-like coding work, especially when latency matters less than output fidelity.
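To put the latency trade-off in concrete terms, the sketch below converts the reported token speeds into rough wall-clock generation times. The speeds come from the post. The 4,000-token output size is a hypothetical figure chosen for illustration, since the post does not report how long each generated file was, and the estimate ignores prompt processing and any reasoning tokens.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Estimated time to emit output_tokens at a steady decoding speed."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Hypothetical output length; not reported in the original post.
ASSUMED_OUTPUT_TOKENS = 4000

# Token speeds as reported in the post.
REPORTED_SPEEDS = {
    "Gemma-4-31b-it": 1.91,
    "Qwen3.6-27B Q4_K_M": 2.65,
    "Qwen3.5 4B": 80.0,
}

if __name__ == "__main__":
    for model, speed in REPORTED_SPEEDS.items():
        minutes = generation_seconds(ASSUMED_OUTPUT_TOKENS, speed) / 60
        print(f"{model}: about {minutes:.1f} min")
```

Under that assumed length, the second-place model would need roughly 25 minutes per attempt versus under a minute for the fastest local model, which is why the result matters mainly where waiting is acceptable.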

Key Points
  • Kimi k2.6 Thinking won overall for visual quality and realism in the animation
  • Local Qwen3.6-27B Q4_K_M at 2.65 tok/s ranked second, beating models such as GPT 5.4 Thinking and Claude Sonnet 4.6 Thinking
  • The slowest model (Gemma-4-31b-it at 1.91 tok/s) still produced usable results, showing that local inference can remain viable even at low speeds

Why It Matters

Local AI models are closing the gap with cloud APIs for specialized coding tasks, reducing dependency on expensive subscriptions.