DeepSeek Releases V4 Open-Weight AI Model with 1 Million-Token Context and Mixture-of-Experts Architecture
1.6 trillion parameters, only 49B active per token, and a 1M-token context window, released as open weights.
DeepSeek unveiled V4 on April 24, offering two open-weight Mixture-of-Experts (MoE) variants. The flagship deepseek-v4-pro packs 1.6 trillion total parameters with only 49 billion active per token, while the more efficient deepseek-v4-flash uses 284 billion total (13 billion active). Both default to a massive 1,000,000-token context window, enabled by DeepSeek's Sparse Attention and token-wise compression techniques. The models are immediately available through DeepSeek's public API, and third parties are already comparing pricing and throughput.
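The gap between total and active parameters follows from the MoE design: a router scores a pool of expert networks for each token and only the top-scoring few are run, so per-token compute tracks the active count rather than the full 1.6T. The sketch below is a minimal illustration of top-k routing; the expert count, expert size, and k are hypothetical values chosen so the ratio roughly matches the reported 1.6T-total / 49B-active figures, not DeepSeek's published configuration.

```python
# Minimal sketch of top-k Mixture-of-Experts routing.
# All counts below are illustrative assumptions, not DeepSeek's actual layout.
import numpy as np

def top_k_routing(router_scores: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k experts with the highest router scores."""
    return np.argsort(router_scores)[-k:]

rng = np.random.default_rng(0)

n_experts = 256          # hypothetical number of experts per MoE layer
k_active = 8             # hypothetical experts activated per token
params_per_expert = 6e9  # hypothetical parameters per expert
shared_params = 1e9      # hypothetical dense (always-active) parameters

# The router produces one score per expert for each token;
# only the top-k experts actually run on that token.
router_scores = rng.normal(size=n_experts)
active_experts = top_k_routing(router_scores, k_active)

total_params = shared_params + n_experts * params_per_expert
active_params = shared_params + k_active * params_per_expert

print(f"Experts selected for this token: {sorted(active_experts.tolist())}")
print(f"Total parameters:  {total_params / 1e12:.2f}T")   # ~1.5T with these toy numbers
print(f"Active per token:  {active_params / 1e9:.0f}B")    # ~49B with these toy numbers
```

Because only the selected experts run, inference cost scales with the active parameter count, which is why a 1.6T-parameter model can be served at roughly the cost of a ~49B dense model.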
Media reception is split. The New York Times frames the open release as a potential soft-power advantage for China, while MIT Technology Review emphasizes the practical gains in long-context handling and lower inference cost. The Economist, however, describes the launch as failing to match the disruptive impact of earlier DeepSeek releases. Meanwhile, NIST's Center for AI Standards and Innovation (CAISI) has already evaluated the Pro variant, signaling that standards bodies are adapting quickly to open frontier models. For practitioners, the real test will come from independent benchmarks and community stress tests on Hugging Face.
- Two MoE variants: Pro with 1.6T total/49B active parameters and Flash with 284B total/13B active
- Default 1,000,000-token context window enabled by DeepSeek Sparse Attention
- Open-weight release, also served through DeepSeek's public API (see the usage sketch after this list); NIST's CAISI has already published an evaluation of the Pro variant
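For readers who want to try the hosted models, the sketch below shows one way a request might look. It assumes V4 is served through the same OpenAI-compatible chat endpoint DeepSeek has used for earlier models and that the model identifiers match the variant names above; both are assumptions, not confirmed details of the V4 rollout.

```python
# Hedged example: querying deepseek-v4-pro via DeepSeek's public API,
# assuming an OpenAI-compatible chat endpoint (as with earlier DeepSeek models).
from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepseek.com",   # DeepSeek's existing API host; assumed unchanged for V4
    api_key="YOUR_DEEPSEEK_API_KEY",
)

# With a 1,000,000-token default context, very long inputs can go in a single
# request; here one long document is passed for summarization.
long_document = open("report.txt", encoding="utf-8").read()

response = client.chat.completions.create(
    model="deepseek-v4-pro",  # flagship variant; "deepseek-v4-flash" for the cheaper one
    messages=[
        {"role": "system", "content": "You are a concise technical summarizer."},
        {"role": "user", "content": f"Summarize the key findings:\n\n{long_document}"},
    ],
)

print(response.choices[0].message.content)
```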
Why It Matters
This open-weight, 1M-context release lowers cost and access barriers for startups and researchers, while pushing standards bodies such as NIST's CAISI toward faster safety auditing of frontier models.