MiniMax M2.7 is on par with GPT-5.4 & Opus 4.6 on most benchmarks 🤖
The new model achieves competitive coding scores while being dramatically cheaper to run than top-tier rivals.
MiniMax, a prominent Chinese AI lab, has released its M2.7 model, positioning it as a formidable, cost-efficient competitor to the industry's leading large language models. Benchmarks show the model on par with OpenAI's GPT-5.4 and Anthropic's Claude Opus 4.6 in critical areas such as coding and agentic tasks. On the SWE Bench Pro test, M2.7 scored 56.2%, beating Google's Gemini 3.1 Pro (54.2%) and coming close to Claude Sonnet 4.6 (57.2%) and GPT-5.4 (57.7%). It also leads on the Multi-SWE Bench with a score of 52.7%. For agentic tasks, where the AI uses tools and takes actions, it scored 62.7% on MM-ClawBench, remaining competitive with more expensive models.
The most disruptive aspect of M2.7 is its staggering cost efficiency. While delivering comparable performance, it is priced at a fraction of its rivals: output tokens cost $1.20 per million, roughly one-twentieth (20.8x cheaper) of Claude Opus 4.6's $25 per million, and input tokens are 16.7x cheaper. This price-performance ratio challenges the prevailing market dynamic, in which top-tier capability has commanded a premium. The main trade-off is a context window roughly one-fifth the size of Opus 4.6's, but for many cost-sensitive development and agent-deployment scenarios, M2.7 presents a compelling value proposition.
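To make that ratio concrete, here is a back-of-envelope sketch. The per-token prices are the figures cited above; the monthly token volume is a hypothetical workload assumption, not reported data:

```python
# Back-of-envelope cost comparison. Prices are the article's figures;
# the monthly token volume is a hypothetical workload, not real data.
OPUS_OUTPUT_PER_M = 25.0   # $ per 1M output tokens, Claude Opus 4.6
M27_OUTPUT_PER_M = 1.2     # $ per 1M output tokens, MiniMax M2.7

output_tokens = 500_000_000  # assume an agent fleet emits 500M tokens/month

opus_cost = output_tokens / 1e6 * OPUS_OUTPUT_PER_M
m27_cost = output_tokens / 1e6 * M27_OUTPUT_PER_M

print(f"Opus 4.6 output cost: ${opus_cost:,.0f}/month")     # $12,500/month
print(f"M2.7 output cost:     ${m27_cost:,.0f}/month")      # $600/month
print(f"Price ratio:          {opus_cost / m27_cost:.1f}x")  # 20.8x
```

Under this assumed volume, the same output traffic costs $12,500 a month on Opus 4.6 versus $600 on M2.7.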
A notable technical achievement highlighted by MiniMax is that M2.7 is the first model to have "deeply participated in its own self-evolution." This suggests the company used advanced reinforcement learning (RL) training loops where the AI assisted in optimizing its own architecture and training process, a cutting-edge approach in model development. This could explain its efficiency gains. The model's strong showing, particularly in coding intelligence benchmarks, establishes MiniMax as a serious player offering a viable alternative for developers and companies looking to deploy capable AI agents without the high operational costs of market leaders.
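MiniMax has not published details of this loop, so the toy sketch below only illustrates the general idea, and it is closer to evolutionary hill-climbing than a full RL pipeline: the model proposes mutations to its own training recipe, each candidate is trained and scored, and the best recipe survives. Every function name and number here is invented for illustration, not MiniMax's method:

```python
import random

random.seed(0)

def train_and_eval(cfg):
    # Toy stand-in for a short training run plus a benchmark eval.
    # Scores peak at lr=1e-4 and code_frac=0.5; purely illustrative.
    return -abs(cfg["lr"] - 1e-4) * 1e4 - abs(cfg["code_frac"] - 0.5)

def propose(base):
    # Hypothetical "self-evolution" step: in MiniMax's framing, the
    # model itself would suggest these mutations to its own recipe.
    return {
        "lr": base["lr"] * random.choice([0.5, 1.0, 2.0]),
        "code_frac": min(1.0, max(0.0, base["code_frac"]
                                   + random.choice([-0.1, 0.0, 0.1]))),
    }

best = {"lr": 3e-4, "code_frac": 0.3}   # invented starting recipe
best_score = train_and_eval(best)
for generation in range(20):
    candidate = propose(best)
    score = train_and_eval(candidate)
    if score > best_score:              # keep the best recipe found so far
        best, best_score = candidate, score

print(best, round(best_score, 4))
```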
- Competes with top models: Scores 56.2% on SWE Bench Pro, near GPT-5.4 (57.7%) and Claude Opus 4.6 (57.3%).
- Unmatched cost efficiency: Output tokens are 20.8x cheaper than Claude Opus 4.6's, at $1.20 vs. $25 per million.
- Self-evolved development: Claimed by MiniMax to be the first model to participate in its own RL training loops, helping optimize its own architecture and training.
Why It Matters
It dramatically lowers the cost of deploying high-performance AI for coding and autonomous agents, increasing accessibility and competition.