Open Source

MiniMax M2.7 (Mac only) 63GB: 88% and 89GB: 95%, MMLU 200q

A leaked 89GB version scores 95% on MMLU, approaching the performance of Anthropic's Claude 3.5 Sonnet.

Deep Dive

A high-performance AI model from Chinese company MiniMax, dubbed M2.7, has been leaked and shared on the Hugging Face platform by a user named JANGQ-AI. The leak includes two quantized versions: a 63GB model and a more capable 89GB variant. Benchmark results are impressive: the smaller model scores 88% and the larger one 95% on the Massive Multitask Language Understanding (MMLU) benchmark, a standard test of reasoning and knowledge (the headline's "200q" suggests a 200-question subset rather than the full suite). The 95% score places the larger variant in the upper echelon of current models, drawing direct comparisons to Anthropic's Claude 3.5 Sonnet.
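
For anyone who wants to inspect the files directly, the sketch below shows one way to pull a model snapshot with the huggingface_hub library. The repository id is hypothetical: the post names the uploader (JANGQ-AI) but not an exact repo path, so substitute the real one.

```python
# Minimal sketch: download a quantized model snapshot from Hugging Face.
# NOTE: the repo_id below is hypothetical -- only the uploader name
# (JANGQ-AI) is known from the post; substitute the actual repository.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="JANGQ-AI/MiniMax-M2.7-quantized")
print(f"Model files downloaded to: {local_dir}")
```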

Early performance tests on Apple's M-series chips show promising efficiency. Users report the model generates approximately 50 tokens per second on a high-end M5 Max Mac, with prompt-processing speeds of around 400 tokens per second. This combination of high benchmark scores and fast local inference on consumer hardware is significant: it points to a trend where near-state-of-the-art AI capabilities can be run locally, reducing reliance on cloud APIs and their associated costs.
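
As a rough way to check throughput numbers like these locally, the following is a minimal sketch using the mlx-lm runtime, a common choice for quantized models on Apple Silicon. The model path is a placeholder, the exact generate() signature varies slightly across mlx-lm versions, and the ~50 tokens/sec figure is the community-reported result, not something this script guarantees.

```python
# Minimal sketch: measure decode throughput of a local model with mlx-lm.
# Assumes `pip install mlx-lm` on an Apple Silicon Mac; the model path
# is a placeholder, and chat-template formatting is skipped for brevity.
import time

from mlx_lm import load, generate

model, tokenizer = load("path/to/minimax-m2.7-quantized")  # placeholder path

prompt = "Explain the MMLU benchmark in two sentences."
start = time.perf_counter()
text = generate(model, tokenizer, prompt=prompt, max_tokens=256)
elapsed = time.perf_counter() - start

n_tokens = len(tokenizer.encode(text))  # count generated tokens
print(f"{n_tokens} tokens in {elapsed:.1f}s -> {n_tokens / elapsed:.1f} tok/s")
```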

The model's architecture details remain unofficial, but its performance profile suggests MiniMax is making substantial strides in model efficiency and capability. The community reaction highlights the desire for a 'Sonnet 3.5 at home'—a locally runnable model that matches the performance of leading closed-source alternatives. This leak provides a tangible, testable point of comparison for the rapid progress being made by AI labs outside the US-centric Big Tech arena, particularly in optimizing models for specific hardware like Apple Silicon.
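
The headline's "MMLU 200q" implies the scores come from a sampled 200-question subset rather than the full benchmark, which anyone can approximate. The sketch below assumes a caller-supplied ask(prompt) -> str wrapper around whatever runtime serves the model, and uses the cais/mmlu dataset on Hugging Face, which provides a question, four answer choices, and an answer index per row.

```python
# Minimal sketch: a 200-question MMLU spot check.
# `ask` is an assumed caller-supplied function that sends a prompt to the
# locally served model and returns its text reply.
import random

from datasets import load_dataset

CHOICES = "ABCD"

def score_subset(ask, n=200, seed=0):
    test = load_dataset("cais/mmlu", "all", split="test")
    rng = random.Random(seed)
    correct = 0
    for i in rng.sample(range(len(test)), n):
        row = test[i]
        options = "\n".join(f"{c}. {o}" for c, o in zip(CHOICES, row["choices"]))
        prompt = (f"{row['question']}\n{options}\n"
                  "Answer with a single letter (A, B, C, or D).")
        reply = ask(prompt).strip().upper()
        if reply[:1] == CHOICES[row["answer"]]:  # answer is an index 0-3
            correct += 1
    return correct / n
```

A strict spot check like this is sensitive to prompt format and answer parsing, so results on the same 200 questions can differ by a few points from the reported figures.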

Key Points
  • The 89GB version scores 95% on MMLU, rivaling top models like Claude 3.5 Sonnet.
  • Runs efficiently on Apple Silicon, hitting ~50 tokens/sec on an M5 Max chip.
  • The model leak demonstrates rapid progress in making high-performance AI locally runnable.

Why It Matters

It brings near-top-tier AI performance to local machines, offering an alternative to costly cloud APIs and enhancing privacy.