Open Source

Gemma 4 has been released

Google's new open-source AI family handles text, images, audio, and video, with context windows of up to 256K tokens.

Deep Dive

Google DeepMind has launched Gemma 4, a significant evolution of its open-source AI model family. This release is notable for its multimodal capabilities, handling text and image inputs across all models, with native audio and video support on the smaller E2B and E4B variants. The models are available in four distinct sizes—E2B, E4B, 26B A4B, and 31B—and feature both Dense and Mixture-of-Experts (MoE) architectures. This design allows for scalable deployment, from efficient execution on mobile devices and laptops using the smaller models to more powerful reasoning on servers with the larger ones. A key technical advancement is the extended context window, reaching up to 256K tokens for medium-sized models, facilitated by a hybrid attention mechanism that balances local and global processing for speed and deep contextual awareness.
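The hybrid attention idea can be sketched in a few lines: some layers use a sliding-window (local) causal mask, so attention cost stays bounded, while others use a full (global) causal mask for long-range context. This is a minimal illustration of the concept only; the actual layer schedule, window size, and local-to-global ratio in Gemma 4 are assumptions here.

```python
import numpy as np

def causal_mask(seq_len: int) -> np.ndarray:
    """Full (global) causal mask: position i may attend to every j <= i."""
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Local causal mask: position i attends only to the last `window` tokens."""
    idx = np.arange(seq_len)
    # Causal AND within the window: i - window < j <= i
    return causal_mask(seq_len) & (idx[None, :] > idx[:, None] - window)

def layer_masks(n_layers: int, seq_len: int, window: int) -> list:
    """Hypothetical schedule: every 4th layer global, the rest local."""
    return [
        causal_mask(seq_len) if (i + 1) % 4 == 0
        else sliding_window_mask(seq_len, window)
        for i in range(n_layers)
    ]

masks = layer_masks(n_layers=8, seq_len=6, window=3)
# Per-query attended-key counts: local layers plateau at the window size,
# global layers grow linearly with position.
print(masks[0].sum(axis=1))  # local layer  -> [1 2 3 3 3 3]
print(masks[3].sum(axis=1))  # global layer -> [1 2 3 4 5 6]
```

The payoff is that local layers keep per-token attention cost constant as the sequence grows, which is what makes a 256K-token window tractable on modest hardware.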

Architecturally, Gemma 4 introduces configurable reasoning modes and native support for system prompts, enabling more structured and controllable AI conversations. The models show marked improvements in coding benchmarks and come with native function-calling support, making them well-suited for building autonomous AI agents. For professionals and developers, this release democratizes access to frontier-level AI by providing open-weights models that are optimized for on-device use. The combination of multimodal understanding, long-context processing, and efficient architectures positions Gemma 4 as a versatile toolkit for applications in reasoning, agentic workflows, and multilingual tasks across over 140 languages.
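Native function calling generally follows a declare-call-dispatch loop: the application declares tool schemas, the model emits a structured call, and the host executes it and feeds the result back. Below is a minimal host-side sketch of that loop; the schema format, the `get_weather` tool, and the emitted-call JSON are illustrative assumptions, not Gemma 4's actual wire format.

```python
import json

# Illustrative tool; a stand-in for a real weather lookup.
def get_weather(city: str) -> dict:
    return {"city": city, "temp_c": 21}

TOOLS = {"get_weather": get_weather}

# JSON-schema-style declaration passed to the model (format is assumed).
TOOL_SCHEMAS = [{
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

def dispatch(tool_call_json: str) -> dict:
    """Parse a model-emitted tool call and run the matching function."""
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Pretend the model emitted this structured call:
result = dispatch('{"name": "get_weather", "arguments": {"city": "Zurich"}}')
print(result)  # {'city': 'Zurich', 'temp_c': 21}
```

In an agent loop, the result dict would be appended to the conversation as a tool message so the model can reason over it in its next turn.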

Key Points
  • Multimodal & Efficient: Processes text and images on all models, plus video and audio on the smaller E2B and E4B variants, with hybrid attention enabling on-device use from phones to servers.
  • Scalable Architectures: Four model sizes (E2B to 31B) with both Dense and Mixture-of-Experts (MoE) variants, supporting context windows up to 256K tokens.
  • Enhanced for Agents: Features native function-calling, system prompt support, and improved coding performance for building capable autonomous AI workflows.

Why It Matters

Delivers open-source, state-of-the-art multimodal AI that professionals can run and customize locally, reducing reliance on cloud APIs.