Mistral AI Introduces Mistral Small 4, Unifying Reasoning, Multimodal, and Agentic Coding Capabilities
One model now handles reasoning, vision, and coding agents with 3x throughput
Mistral AI today announced Mistral Small 4, a hybrid Mixture-of-Experts (MoE) model that unifies the capabilities of three previous flagship models: Magistral (reasoning), Pixtral (multimodal), and Devstral (agentic coding). The model has 119B total parameters but activates only 6B per token (8B including embeddings), routing each token to 4 of its 128 experts, a sparse design that lets experts specialize while keeping per-token compute low. It accepts text and image inputs natively and supports a 256K context window, making it suitable for long-form interactions and tasks ranging from document parsing to visual analysis.
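To make the sparse-activation idea concrete, here is a minimal top-k routing sketch in PyTorch. It is illustrative only, not Mistral's implementation, and the hidden size is a toy value; the point is that each token touches only 4 of 128 experts, which is how a 119B-parameter model does roughly 6B parameters' worth of work per token.

```python
import torch
import torch.nn.functional as F

# Toy MoE layer: 128 experts, 4 active per token (HIDDEN=64 is a toy size).
NUM_EXPERTS, TOP_K, HIDDEN = 128, 4, 64

router = torch.nn.Linear(HIDDEN, NUM_EXPERTS)
experts = torch.nn.ModuleList(
    torch.nn.Linear(HIDDEN, HIDDEN) for _ in range(NUM_EXPERTS)
)

def moe_forward(x: torch.Tensor) -> torch.Tensor:
    """x: (tokens, HIDDEN). Each token is routed to its top-4 experts."""
    probs = F.softmax(router(x), dim=-1)              # (tokens, 128)
    weights, idx = torch.topk(probs, TOP_K, dim=-1)   # pick 4 of 128 experts
    weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize over top-4
    out = torch.zeros_like(x)
    for t in range(x.size(0)):
        for k in range(TOP_K):  # only the selected experts run for this token
            out[t] += weights[t, k] * experts[int(idx[t, k])](x[t])
    return out

y = moe_forward(torch.randn(5, HIDDEN))  # 5 tokens in, 5 tokens out
```

Production implementations batch tokens by expert rather than looping per token, but the routing logic is the same: total parameter count grows with the expert count while per-token FLOPs stay roughly constant.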
A key innovation is the configurable reasoning_effort parameter, which lets users toggle between fast, low-latency responses (reasoning_effort="none") and deep, step-by-step reasoning (reasoning_effort="high"). Compared to Mistral Small 3, the new model cuts end-to-end completion time by 40% and serves 3x more requests per second. On benchmarks such as AIME 2025 and LiveCodeBench, Mistral Small 4 matches or surpasses GPT-OSS 120B while generating 20-50% shorter outputs, which reduces both latency and inference cost. The model is released under the Apache 2.0 license, is optimized for deployment on 4x NVIDIA HGX H100 or 2x DGX B200, and ships with support for vLLM, SGLang, llama.cpp, and Transformers.
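The announcement names the parameter and its values but not how it is exposed on the wire. A minimal sketch against Mistral's chat completions endpoint, assuming a model id of "mistral-small-4" and that reasoning_effort is accepted as a top-level request field (both are assumptions), might look like this:

```python
import os
import requests

# Assumptions: the model id "mistral-small-4" and reasoning_effort as a
# top-level request field are illustrative; only the parameter name and its
# "none"/"high" values come from the announcement.
resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-small-4",
        "reasoning_effort": "high",  # "none" = fast chat, "high" = deep reasoning
        "messages": [{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```

For self-hosting, vLLM is one of the supported runtimes; a sketch using its offline API, assuming a hypothetical Hugging Face id and a four-GPU node matching the stated H100 target:

```python
from vllm import LLM, SamplingParams

# "mistralai/Mistral-Small-4" is a placeholder id; tensor_parallel_size=4
# shards the weights across the four GPUs of the stated H100 deployment target.
llm = LLM(model="mistralai/Mistral-Small-4", tensor_parallel_size=4)
outputs = llm.generate(
    ["Extract the parties and effective date from this contract: ..."],
    SamplingParams(temperature=0.2, max_tokens=256),
)
print(outputs[0].outputs[0].text)
```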
- 119B total parameters, 6B active per token, with 128 MoE experts (4 active per token)
- Configurable reasoning effort: toggle between fast chat and deep reasoning with a single parameter
- 3x more requests per second and 40% lower end-to-end latency vs Mistral Small 3, matching or beating GPT-OSS 120B on reasoning and coding benchmarks
Why It Matters
One open-source model now replaces three specialized ones, cutting infrastructure costs and latency for enterprises.