Minimax M3 model breaks records with 4M token context

r/Singularity June 01, 2026

⚡New Chinese AI model boasts 456B parameters and beats GPT-4 on benchmarks…

Deep Dive

Minimax, a Chinese AI company, has launched M3, a multimodal model aimed at long-context processing and reasoning. With 456 billion parameters and a 4 million token context window (reportedly about 8x longer than GPT-4 Turbo's), M3 is designed to take in entire books, code repositories, or hours of video in a single pass. The model uses a mixture-of-experts (MoE) architecture: a routing step activates only a fraction of the parameters for each query, which keeps per-query compute well below that of a dense model of the same size. Early benchmarks reportedly show M3 surpassing GPT-4 on long-document QA, mathematical reasoning, and code generation.
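To make the "only a fraction of its parameters per query" idea concrete, here is a minimal sketch of top-k expert routing, the general technique behind most MoE models. It is not Minimax's implementation: the article does not state M3's expert count, its k, or its gating function, so every name and number below is a hypothetical chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Pick the k highest-scoring experts and normalize their weights.

    Returns a list of (expert_index, weight) pairs. Ties keep the
    lower index first because Python's sort is stable.
    """
    order = sorted(range(len(gate_scores)),
                   key=lambda i: gate_scores[i], reverse=True)
    chosen = order[:k]
    weights = softmax([gate_scores[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs.

    `experts` is a list of callables; the unselected ones are never
    called, which is where the compute saving comes from.
    """
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))


def active_fraction(num_experts, k):
    """Share of expert parameters used per input (equal-sized experts assumed)."""
    return k / num_experts
```

With 8 equally sized experts and k=2 (illustrative values only), `active_fraction(8, 2)` gives 0.25, meaning a quarter of the expert parameters do work on any one input while the rest stay idle.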

Beyond raw capability, M3 is pitched at production deployment. Minimax claims the model costs 80% less to run than comparable proprietary models, which would lower the barrier to enterprise-scale use. Multimodal support covers images, audio, and video, allowing cross-modal analysis. M3 is not fully open-source, but Minimax is offering API access with pricing it describes as competitive. The launch positions a Chinese lab as a direct challenger to Western models on both performance and cost. For developers, the practical change is that legal document analysis, codebase review, and video understanding could be done without complex chunking strategies, provided the input fits within the 4 million token window.
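The chunking point can be sketched as simple arithmetic: estimate a document's token count and see how many requests it needs for a given window. This is a rough illustration only. The 4-characters-per-token ratio is a common heuristic for English text rather than a property of M3's tokenizer, the output reserve is an arbitrary placeholder, and the function names are hypothetical.

```python
import math

M3_CONTEXT_TOKENS = 4_000_000  # context window figure quoted in the article
CHARS_PER_TOKEN = 4            # assumption: rough heuristic, not M3's tokenizer


def estimate_tokens(text):
    """Approximate token count from character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def requests_needed(token_count, context_tokens, reserved_for_output=4_096):
    """Number of requests needed to cover `token_count` input tokens.

    `reserved_for_output` is a placeholder budget kept free for the
    model's reply; 1 means the input fits in a single pass.
    """
    budget = context_tokens - reserved_for_output
    if budget <= 0:
        raise ValueError("context window is smaller than the output reserve")
    return max(1, math.ceil(token_count / budget))


def needs_chunking(token_count, context_tokens):
    """True when the input cannot be sent in one request."""
    return requests_needed(token_count, context_tokens) > 1
```

For example, a 1,000,000-token codebase fits in one request against a 4 million token window, while the same input against a hypothetical 128,000-token window would have to be split into several chunks, each losing sight of the others.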

Key Points

456 billion parameters with mixture-of-experts architecture
4 million token context window – 8x longer than GPT-4 Turbo
Outperforms GPT-4 on reasoning and long-document benchmarks at 80% lower cost

Why It Matters

Enables processing of entire books or codebases in one query, slashing costs and unlocking new enterprise AI applications.
