OpenAI releases Mini and Nano variants of GPT-5.4
The Mini cuts API costs by 90% versus GPT-4 Turbo, while the Nano runs on-device for privacy.
OpenAI has expanded its model lineup with the launch of GPT-5.4 Mini and GPT-5.4 Nano, marking a strategic push into the small language model (SLM) market. GPT-5.4 Mini is a 7-billion-parameter model built for developers who need a balance of performance and cost, priced at $0.04 per 1M input tokens, a 90% reduction compared with GPT-4 Turbo. It is designed for high-volume API calls in applications like chatbots, content moderation, and data extraction, where deep reasoning isn't required. The company claims it matches the performance of models twice its size on common benchmarks, making it a compelling option for scaling AI features.
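To make the quoted pricing concrete, here is a rough back-of-the-envelope sketch. The $0.04-per-1M-input-tokens figure comes from the announcement; the function name and traffic numbers are illustrative assumptions, not part of any API.

```python
# Rough cost sketch based on the announced $0.04 per 1M input tokens
# for GPT-5.4 Mini. Request volumes below are hypothetical.

MINI_PRICE_PER_1M_INPUT = 0.04  # USD, figure from the announcement


def monthly_input_cost(requests_per_day: int, tokens_per_request: int,
                       days: int = 30) -> float:
    """Estimated monthly input-token spend in USD."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1_000_000 * MINI_PRICE_PER_1M_INPUT


# Example: a chatbot handling 100k requests/day at ~500 input tokens each
print(monthly_input_cost(100_000, 500))  # → 60.0
```

At that rate, even a fairly busy service stays in the tens of dollars per month for input tokens, which is the scale of savings the 90% cut implies (output-token pricing, not stated here, would add to the bill).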
GPT-5.4 Nano is a more radical departure: a 3B-parameter model engineered for on-device and offline inference. It is optimized to run efficiently on consumer hardware such as laptops and smartphones without a constant internet connection, addressing growing demand for privacy-focused AI. Developers can now build fully local applications, from personal writing assistants to document analyzers, that process sensitive data entirely on a user's device. The release puts OpenAI in direct competition with other efficient models such as Google's Gemma 2 and Meta's Llama 3.1, signaling its intent to cover the entire developer stack, from massive cloud models to compact, specialized ones.
- GPT-5.4 Mini is a 7B-parameter model priced at $0.04 per 1M input tokens, a 90% cost cut vs. GPT-4 Turbo.
- GPT-5.4 Nano is a 3B-parameter model built for on-device, offline use, enabling private AI applications.
- Both models target developers building cost-effective, high-volume, or privacy-sensitive AI agents and tools.
Why It Matters
Cheaper API pricing and on-device inference drastically lower the cost and expand the reach of AI, enabling a new wave of affordable, private applications.