You can run MiniMax-2.5 locally
Run a state-of-the-art 230B parameter model locally with just 101GB of VRAM.
Deep Dive
MiniMax-2.5 is a new open-source LLM achieving state-of-the-art performance in coding, tool use, and office tasks. The massive 230B-parameter model (10B active parameters per token) features a 200K context window. While the unquantized weights require 457GB, a new 3-bit GGUF quantization from Unsloth shrinks them to just 101GB, a roughly 78% reduction, making it feasible to run the model locally on high-end consumer hardware.
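As a rough sanity check on those figures, model size in bytes is approximately parameter count times bits per weight divided by 8. The sketch below assumes 16 bits per weight for the unquantized checkpoint and an effective ~3.5 bits per weight for the "3-bit" GGUF (mixed-precision quants typically average slightly above their nominal bit width); both assumptions are illustrative, not taken from the release.

```python
def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate on-disk size of a model in gigabytes (decimal GB)."""
    return n_params * bits_per_weight / 8 / 1e9

# 230B parameters at 16 bits/weight: ~460 GB, close to the cited 457GB.
full = model_size_gb(230e9, 16)

# Same model at an effective ~3.5 bits/weight: ~101 GB, matching the GGUF.
quantized = model_size_gb(230e9, 3.5)

print(f"unquantized: {full:.0f} GB")
print(f"3-bit GGUF:  {quantized:.0f} GB")
print(f"reduction:   {1 - quantized / full:.0%}")
```

This also shows why the size drop is about 78%, not 62%: going from 16 to ~3.5 bits per weight removes just under four fifths of the footprint.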
Why It Matters
This dramatically lowers the barrier for developers and researchers to access and experiment with frontier-level AI capabilities on their own machines.