Gemma 4 E4B + E2B Uncensored (Aggressive) — GGUF + K_P Quants (Multimodal: Vision, Video, Audio)
Two 'Aggressive' uncensored variants of Google's Gemma 4, the E4B and E2B, handle text, images, video, and audio with no content restrictions.
Independent AI developer HauhauCS has launched the first uncensored versions of Google's recently announced Gemma 4 models. Dubbed 'Aggressive' variants, the Gemma 4 E4B (4 billion parameters) and E2B (2 billion parameters) are engineered for zero refusals, stripping away the built-in safety filters while purportedly retaining 100% of the original models' capabilities. This release directly challenges the increasing use of generative reward models (GenRM) by companies like Google and NVIDIA, which act as internal critics to enforce content policies.
Both models are natively multimodal, capable of processing text, images, video, and audio within a single architecture, and feature a 131K-token context window. They are released in the GGUF format with specialized 'K_P' quantizations, which use model-specific analysis to preserve quality, delivering output quality comparable to quantizations one to two levels higher at only a 5-15% larger file size. The packages include the necessary 'mmproj' file for vision and audio support and are compatible with popular local inference tools such as llama.cpp and LM Studio.
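As a minimal sketch of what local loading could look like with llama-cpp-python: the filenames below are hypothetical placeholders for the actual release files, and whether the generic LLaVA-style chat handler wires this model family's mmproj projector correctly depends on your llama.cpp build.

```python
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

# The mmproj file carries the multimodal projector weights (vision/audio).
# Filename is a hypothetical placeholder; use the file shipped in the release.
chat_handler = Llava15ChatHandler(clip_model_path="gemma-4-e4b-aggressive.mmproj.gguf")

llm = Llama(
    model_path="gemma-4-e4b-aggressive.Q4_K_P.gguf",  # hypothetical filename
    chat_handler=chat_handler,
    n_ctx=131072,  # the advertised 131K-token window; lower it if RAM is tight
)

# Ask a question about a local image via the standard multimodal message format.
response = llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "file:///path/to/photo.jpg"}},
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ]
)
print(response["choices"][0]["message"]["content"])
```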
The creator notes that uncensoring these smaller models was straightforward, but warns that the upcoming, larger Gemma 4 E31B (dense) and E26B-A4B (Mixture-of-Experts) variants will require more intricate work. A disclaimer clarifies that while extensive testing shows zero refusals, the complexity of modern censorship techniques means edge cases might exist, though these are expected to be negligible for most users. This release caters to a niche that demands completely unrestricted AI for research, development, or specific applications where content filtering is a barrier.
- Zero-Refusal Design: The 'Aggressive' tag means the models are engineered for zero refusals across a 465-prompt test suite (0/465), with all of the built-in content safety filters from Google's original release removed.
- Native Multimodality: A single model architecture processes text, images, video, and audio, supported by an included mmproj file for vision/audio tasks.
- Optimized for Local Use: Released in a range of efficient K_P GGUF quantizations (Q8_K_P down to Q2_K_P), ensuring compatibility with llama.cpp and LM Studio for local deployment; a rough sizing sketch follows this list.
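To make the size/quality trade-off concrete, here is a back-of-the-envelope sizing sketch in Python. It relies only on the 5-15% overhead figure quoted above and on approximate bits-per-weight values for standard llama.cpp K-quants; the exact K_P bit widths are assumptions, not documented specifics.

```python
# Rough GGUF file-size estimate: size_bytes ~= n_params * bits_per_weight / 8.
PARAMS = 4e9  # E4B: roughly 4 billion parameters

# Approximate effective bits per weight for common GGUF quant levels (assumed).
BPW = {"Q8_0": 8.5, "Q6_K": 6.6, "Q5_K": 5.5, "Q4_K": 4.5, "Q3_K": 3.4, "Q2_K": 2.6}

for name, bpw in BPW.items():
    base_gb = PARAMS * bpw / 8 / 1e9                   # plain K-quant size
    kp_low, kp_high = base_gb * 1.05, base_gb * 1.15   # claimed 5-15% K_P overhead
    print(f"{name:>5}: ~{base_gb:.1f} GB plain, ~{kp_low:.1f}-{kp_high:.1f} GB as K_P")
```

Under these assumptions, a Q4_K_P of the E4B model would land somewhere around 2.4-2.6 GB, which is why even the larger K_P variants remain practical on consumer hardware.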
Why It Matters
This release provides developers and researchers with fully unrestricted, multimodal AI models for local use, bypassing corporate content policies for specialized applications.