Open Source

Qwen3.5 feels ready for production use - Never been this excited

Developer tests show Qwen3.5-35B nails real client projects, potentially replacing expensive API subscriptions.

Deep Dive

A developer's extensive testing of Alibaba's Qwen3.5-35B-A3B-UD-Q6_K_XL model reveals surprisingly production-ready coding capabilities that could disrupt the current AI development landscape. The model, running at 47.71 tokens/second split across two GPUs (and hitting 80 tokens/second on a single GPU), successfully handled five real-world JavaScript, Go, and Rust client projects with only minor "5-minute tweak" bugs remaining. The tester compares the experience directly to using Anthropic's Claude Sonnet 4, noting particularly strong performance across the JavaScript ecosystem. This suggests local models may have reached a tipping point where they can handle substantial professional development work previously reserved for expensive API-based models like Claude.

The implications are significant for developer economics and hardware investment decisions. The tester, who has spent $2,000 on Claude Pro Max subscriptions since June 2025 and projects $6,800 in total subscription costs through 2027, is now seriously considering investing in an RTX 6000 Pro workstation instead. This points toward a hybrid model in which developers reserve APIs for state-of-the-art spec generation and code reviews while running most day-to-day coding work locally. If local models like Qwen3.5 keep improving at this pace, the $200-per-month, per-developer subscription model faces increasing pressure, potentially accelerating the trend toward local AI deployment in professional development environments.

Key Points
  • Qwen3.5-35B achieved 47.71 tokens/sec on two GPUs (80 tokens/sec on one) while successfully completing five real client projects
  • The model performed comparably to Claude Sonnet 4 across JavaScript, Go, and Rust projects with only minor bugs requiring quick fixes
  • Developer calculates potential $6,800 Claude subscription savings through 2027, making local hardware investment like RTX 6000 Pro economically viable
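The cost comparison above can be sketched as a simple break-even calculation. The $200/month subscription figure comes from the article; the workstation price below is a placeholder assumption for illustration, not a quoted price.

```python
# Break-even sketch: months of subscription spend needed to match
# a one-time hardware purchase.

MONTHLY_SUBSCRIPTION = 200      # $/month, per the article
WORKSTATION_PRICE = 8_000       # $ -- assumed placeholder, check current pricing

def break_even_months(hardware_cost: float, monthly_cost: float) -> float:
    """Return how many months of subscription fees equal the hardware outlay."""
    return hardware_cost / monthly_cost

months = break_even_months(WORKSTATION_PRICE, MONTHLY_SUBSCRIPTION)
print(f"Break-even after {months:.0f} months")  # 40 months at these numbers
```

At these assumed numbers the hardware pays for itself in under four years, before counting electricity or the residual value of the GPU; the real crossover depends entirely on the actual workstation price and how much API usage the local model displaces.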

Why It Matters

Local AI models now compete with expensive API subscriptions, potentially saving developers thousands annually while offering greater control.