v4.3.3 - Gemma 4 support!
The popular local AI UI now supports Google's latest Gemma 4 model, adding tool-calling capabilities and trimming up to 50ms of UI latency per event.
Oobabooga has launched version 4.3.3 of the popular text-generation-webui, bringing official support for Google's latest Gemma 4 model. This update enables users to run Gemma 4 locally with full tool-calling capabilities through both the API and web interface. The release also introduces ik_llama.cpp as a new backend option, featuring improved quantization accuracy through Hadamard KV cache rotation and optimizations for MoE models and CPU inference.
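Because text-generation-webui exposes an OpenAI-compatible API, a tool-calling request against a locally running Gemma 4 model would look roughly like the sketch below. The endpoint URL, model name, and the `get_weather` tool are illustrative assumptions, not values taken from the release.

```python
import json

# Hypothetical weather-lookup tool; the payload follows the
# OpenAI-compatible chat-completions schema that the web UI's API serves.
payload = {
    "model": "gemma-4",  # illustrative model name
    "messages": [
        {"role": "user", "content": "What's the weather in Berlin?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

# Sent to the local server, e.g. (default port shown is an assumption):
#   requests.post("http://127.0.0.1:5000/v1/chat/completions", json=payload)
print(json.dumps(payload)[:30])
```

If the model decides to call the tool, the response's `tool_calls` field carries the function name and JSON arguments for the client to execute.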
Significant API enhancements include new echo and logprobs parameters on the /v1/completions endpoint, providing token-level probabilities for both prompt and generated tokens. The UI received performance boosts from custom Gradio optimizations that save up to 50ms per event, such as a button click. Security improvements address SSRF vulnerabilities in extensions and add server-side validation for form components.
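A minimal sketch of a /v1/completions request combining the two new parameters, assuming the standard OpenAI-style completions schema; the model name and port are placeholders:

```python
import json

# echo=True asks the server to include the prompt tokens in the response;
# logprobs=5 requests the top-5 log-probabilities for each token.
payload = {
    "model": "gemma-4",            # illustrative model name
    "prompt": "The capital of France is",
    "max_tokens": 8,
    "echo": True,
    "logprobs": 5,
}

# With both set, choices[0]["logprobs"] in the response covers the
# prompt tokens as well as the generated ones.
#   requests.post("http://127.0.0.1:5000/v1/completions", json=payload)
print(json.dumps(payload, indent=2))
```

This combination is useful for scoring a fixed text under the model: set max_tokens to 0-ish values and read the prompt-token log-probabilities back out.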
The update includes multiple bug fixes and dependency updates, among them an updated llama.cpp for Gemma 4 compatibility and a bump of ExLlamaV3 to version 0.0.28. Portable builds now ship in both standard llama.cpp and ik_llama.cpp variants across Windows, Linux, and macOS. This release solidifies text-generation-webui's position as a comprehensive solution for running cutting-edge open models like Gemma 4 alongside established architectures.
- Adds official Gemma 4 support with tool-calling capabilities in API and UI
- Introduces ik_llama.cpp backend with improved KV cache quantization and MoE optimizations
- Optimizes UI performance by saving up to 50ms per event and adds token-level logprobs to the API
Why It Matters
Enables developers to run Google's latest Gemma 4 model locally with full tool-calling, expanding accessible AI capabilities.