Open Source

DeepSeek releases V4 Pro DSpark: faster inference with sparse attention

New open-source model achieves up to 3x speedup using dynamic sparse computation...

Deep Dive

DeepSeek AI has released DeepSeek-V4-Pro-DSpark, an open-source large language model built around DSpark, a new dynamic sparse attention mechanism. The model is available on Hugging Face alongside a research paper.

Key Points
  • DSpark prunes 70% of attention heads dynamically during inference, with no reported loss in accuracy.
  • Inference on long contexts (32K–128K tokens) runs 2.5–3x faster than on the dense base model, V4 Pro.
  • The release on Hugging Face is open source and includes the full paper and code; with quantization, the model can run on a single A100 GPU.
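
To make the head-pruning idea concrete, here is a minimal sketch of selecting which attention heads to keep for a given input. It is not DSpark's actual algorithm: the article does not say how heads are scored, so the per-head importance scores, the function names, and the top-k selection rule are all assumptions for illustration. Only the 70% pruning ratio comes from the release.

```python
def select_active_heads(head_scores, prune_ratio=0.7):
    """Return the sorted indices of the heads to keep.

    head_scores is a hypothetical per-head importance score computed for
    the current input; the lowest-scoring heads are pruned.
    """
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    num_heads = len(head_scores)
    # Always keep at least one head so the layer still produces output.
    num_keep = max(1, round(num_heads * (1.0 - prune_ratio)))
    ranked = sorted(range(num_heads), key=lambda i: head_scores[i], reverse=True)
    return sorted(ranked[:num_keep])


def run_active_heads(head_fns, head_scores, x, prune_ratio=0.7):
    """Evaluate only the kept heads; pruned heads are never called.

    head_fns is a list of callables standing in for per-head attention.
    Skipping the calls is where a real implementation would save compute.
    """
    keep = select_active_heads(head_scores, prune_ratio)
    return {i: head_fns[i](x) for i in keep}


# Ten toy heads with made-up importance scores for one input.
scores = [0.10, 0.90, 0.30, 0.80, 0.20, 0.05, 0.70, 0.40, 0.60, 0.15]
heads = [lambda x, k=k: x * k for k in range(10)]  # stand-in head functions
active = select_active_heads(scores)
outputs = run_active_heads(heads, scores, 2.0)
print(active)
```

With a 70% pruning ratio, only 3 of the 10 toy heads are evaluated. Because the scores depend on the input, a different input could activate a different subset, which is what makes the sparsity dynamic rather than fixed at training time.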

Why It Matters

By fitting on a single A100 GPU with quantization, the release makes frontier-class 400B+ models practical on far more modest hardware than before, which could sharply lower inference costs for developers.
