Open Source

Ternary Bonsai: Top intelligence at 1.58 bits

New 1.58-bit AI models cut memory use by roughly 9x while outperforming most peers on standard benchmarks.

Deep Dive

PrismML has introduced Ternary Bonsai, a groundbreaking family of language models that pushes the frontier of AI efficiency. These models, available in 8B, 4B, and 1.7B parameter sizes, use a novel 1.58-bit quantization scheme with ternary weights {-1, 0, +1}: each weight takes one of three values, so it carries log2(3) ≈ 1.58 bits of information. The result is a memory footprint approximately 9x smaller than conventional 16-bit models, a critical advancement for deploying capable AI on edge devices, smartphones, and other hardware with strict memory constraints.
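PrismML has not published the details of its quantization recipe, so the sketch below is only a minimal illustration of how ternary weights work in general, using an absmean-style quantizer of the kind popularized by BitNet-b1.58-style models; the function names, per-tensor scaling, and error check are assumptions, not PrismML's implementation.

    import numpy as np

    def ternary_quantize(W: np.ndarray):
        """Quantize a float weight matrix to {-1, 0, +1} with a per-tensor
        scale (absmean-style; illustrative only, not PrismML's recipe)."""
        scale = np.mean(np.abs(W)) + 1e-8          # scale so typical weights land near +/-1
        W_t = np.clip(np.round(W / scale), -1, 1)  # snap each weight to a ternary level
        return W_t.astype(np.int8), scale

    def dequantize(W_t: np.ndarray, scale: float) -> np.ndarray:
        """Recover an approximate float matrix to measure quantization error."""
        return W_t.astype(np.float32) * scale

    rng = np.random.default_rng(0)
    W = rng.normal(0.0, 0.02, size=(1024, 1024)).astype(np.float32)
    W_t, s = ternary_quantize(W)
    print("levels:", np.unique(W_t))                                 # [-1  0  1]
    print("mean abs error:", np.abs(W - dequantize(W_t, s)).mean())

Because each weight carries only about 1.58 bits, the theoretical reduction against FP16 is 16 / 1.58 ≈ 10x; the ~9x figure quoted for the models presumably reflects overhead such as scales and any layers kept in higher precision.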

Building on the company's earlier 1-bit Bonsai models, Ternary Bonsai represents a strategic pivot on the efficiency curve, trading a modest increase in model size for a significant gain in performance. Initial benchmarks indicate these compact models outperform most peers in their respective parameter classes, challenging the assumption that high compression necessitates a major sacrifice in accuracy. The models are released as FP16 safetensors on Hugging Face for compatibility with standard tooling, alongside a more efficient packed MLX 2-bit format, signaling a move towards specialized, hardware-optimized deployment.
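The release does not describe the on-disk layout of the packed MLX 2-bit format, but the basic idea of 2-bit packing is simple: a ternary weight needs only three codes, so four weights fit in one byte. The sketch below, with assumed helper names and an assumed code mapping, illustrates that packing and is not the actual MLX container format.

    import numpy as np

    def pack_ternary_2bit(w: np.ndarray) -> np.ndarray:
        """Pack ternary weights {-1, 0, +1} at 2 bits each, four per byte.
        Code mapping (-1 -> 0b00, 0 -> 0b01, +1 -> 0b10) is an assumption."""
        codes = (w.astype(np.int16) + 1).astype(np.uint8).reshape(-1, 4)
        return (codes[:, 0]
                | (codes[:, 1] << 2)
                | (codes[:, 2] << 4)
                | (codes[:, 3] << 6)).astype(np.uint8)

    def unpack_ternary_2bit(packed: np.ndarray) -> np.ndarray:
        """Inverse of pack_ternary_2bit: recover the ternary weights."""
        codes = np.stack([(packed >> s) & 0b11 for s in (0, 2, 4, 6)], axis=1)
        return codes.reshape(-1).astype(np.int8) - 1

    # Toy example; assumes the weight count is divisible by 4.
    w = np.random.default_rng(0).integers(-1, 2, size=4096).astype(np.int8)
    packed = pack_ternary_2bit(w)
    assert np.array_equal(w, unpack_ternary_2bit(packed))
    print(f"{w.nbytes} bytes unpacked -> {packed.nbytes} bytes packed")  # 4096 -> 1024

Packed at 2 bits per weight, the 8B model's weight tensors would shrink from roughly 16 GB in FP16 to about 2 GB, which is what makes on-device deployment plausible.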

The release targets a clear market need for powerful yet efficient models that can run locally without constant cloud connectivity. While the current sizes are impressive, the AI community is already anticipating larger 20-40B parameter versions, which could fundamentally alter the landscape for 'large' models by making them drastically more portable and cost-effective to run.

Key Points
  • Uses 1.58-bit ternary weights {-1, 0, +1} for a memory footprint ~9x smaller than 16-bit models.
  • Available in three sizes (8B, 4B, and 1.7B parameters), with each model outperforming most peers in its class on benchmarks.
  • Released as FP16 safetensors on Hugging Face, with a packed MLX 2-bit format for efficient deployment.

Why It Matters

Enables powerful AI to run on smartphones and edge devices, reducing reliance on cloud infrastructure and associated costs.