Open Source

Alibaba's Qwen 3.7 launches with 128K context and 40% faster inference

New model tops GPT-4o on math and matches Claude 3.5 Sonnet on code, free on Qwen Chat

Deep Dive

Alibaba Cloud has unveiled Qwen 3.7, the latest iteration of its flagship large language model, now live in the Qwen Chat web and mobile apps. The 72B-parameter model has a 128K-token context window, large enough to analyze entire research papers or lengthy codebases in a single conversation. Early benchmarks show Qwen 3.7 scoring 92.3% on MATH-500 and 88.1% on HumanEval, surpassing OpenAI's GPT-4o on math reasoning and matching Claude 3.5 Sonnet on code generation. Alibaba also claims 40% higher inference throughput than Qwen 3.0, achieved through optimized model parallelism and quantization, which it says makes Qwen 3.7 one of the fastest open-weight models in its class.

Qwen 3.7 is free to use on Qwen Chat with a limit of 100 messages per day, while a Pro subscription ($19.99/month) unlocks unlimited usage and priority GPU access. The model accepts multimodal input (text and images) through a vision encoder, though video and audio are not yet supported. Developers can also download the weights from Hugging Face under the Apache 2.0 license for local deployment. The release positions Alibaba as a major competitor in the open-weight AI space, challenging Meta's Llama 3 and Mistral's models on reasoning performance and context length.
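For readers weighing local deployment, a rough back-of-the-envelope estimate of the memory needed just to hold the weights can be sketched as follows. The 72B parameter count comes from the announcement; the precision formats and their bytes-per-parameter values are generic assumptions for illustration, not a list of formats Alibaba has confirmed it publishes. The estimate covers weights only and ignores the KV cache, activations, and runtime overhead, all of which add to the real footprint.

```python
# Rough weight-memory estimate for a dense model at several precisions.
# Assumption: 72e9 parameters (from the announcement); the formats below are
# generic examples, not confirmed release formats for this model.

PARAMS = 72e9

# Bytes needed to store one parameter at each (assumed) precision.
BYTES_PER_PARAM = {
    "fp16/bf16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the memory for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, size in BYTES_PER_PARAM.items():
        print(f"{name}: ~{weight_memory_gb(PARAMS, size):.0f} GB")
```

Under these assumptions, 16-bit weights alone come to roughly 144 GB, which is why quantized variants matter for anyone running a model of this size outside a multi-GPU server.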

Key Points
  • 72B parameters with a 128K-token context window, available on Qwen Chat and as open weights under Apache 2.0
  • Outperforms GPT-4o on MATH-500 (92.3%) and matches Claude 3.5 Sonnet on HumanEval (88.1%)
  • Claimed 40% faster inference than Qwen 3.0; free tier with 100 messages/day, Pro at $19.99/month

Why It Matters

An open-weight model that rivals closed-source leaders gives developers a free, high-performance alternative for reasoning and coding tasks.