Open Source

Qwen3.5-4B Uncensored Aggressive Release (GGUF)

A new 4B-parameter model that answers everything with no measured capability loss, available in multiple GGUF quantizations.

Deep Dive

Independent AI developer HauhauCS has released an aggressively uncensored version of Qwen's recently launched Qwen3.5-4B model, creating what appears to be the first completely refusal-free variant of this new architecture. The model, called Qwen3.5-4B-Uncensored-HauhauCS-Aggressive, achieved 0 refusals out of 465 test cases while retaining the original model's full capabilities—a notable result in the uncensored-model space, where capability loss is common. The release comes just as Qwen introduced its new small-model family, built on a hybrid architecture that interleaves Gated DeltaNet linear-attention layers with full softmax-attention layers in a 3:1 ratio.
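The reported 3:1 layer mix can be sanity-checked with a quick sketch. Note this is illustrative arithmetic only: the assumption that the ratio is applied uniformly across repeating 4-layer blocks is ours, not confirmed by the release notes.

```python
# Sketch: layer split for a hybrid attention stack, assuming the 3:1
# ratio of linear-attention (Gated DeltaNet) to full softmax-attention
# layers is applied uniformly across the 32-layer model.
TOTAL_LAYERS = 32
LINEAR_PER_BLOCK, SOFTMAX_PER_BLOCK = 3, 1  # 3:1 ratio per 4-layer block

blocks = TOTAL_LAYERS // (LINEAR_PER_BLOCK + SOFTMAX_PER_BLOCK)
linear_layers = blocks * LINEAR_PER_BLOCK    # linear-attention layers
softmax_layers = blocks * SOFTMAX_PER_BLOCK  # full-attention layers

# Only the softmax layers accumulate a per-token KV cache, which is
# part of why hybrid designs stay tractable at long context lengths.
print(linear_layers, softmax_layers)  # → 24 8
```

Under that assumption, only 8 of the 32 layers carry a full KV cache at the model's 262K context length.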

The technical specifications include 4 billion dense parameters, 32 layers, and a native 262K context length with multimodal capabilities (text, image, video). Available in multiple GGUF quantizations, ranging from Q4_K_M at 2.6GB to BF16 at 7.9GB, the model runs on llama.cpp, LM Studio, Jan, and koboldcpp. HauhauCS employed a 'lossless uncensoring' approach without dataset changes, though some responses still include small disclaimers inherited from the base model's training. The developer is already working on uncensored versions of the larger Qwen3.5-9B, 27B, and 35B models, suggesting this could grow into a comprehensive uncensored family for local deployment.
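The quoted file sizes imply an effective bits-per-weight figure for each quantization, which can be estimated with back-of-the-envelope arithmetic. This is a rough sketch: GGUF files also carry embeddings, metadata, and per-block quantization scales, and the sizes are taken at face value from the release, so the results are approximate.

```python
# Rough effective bits-per-weight from the published GGUF file sizes,
# assuming 4e9 parameters and treating "GB" as 10^9 bytes.
PARAMS = 4e9

def bits_per_weight(file_gb: float) -> float:
    """Convert a file size in GB to average bits stored per parameter."""
    return file_gb * 1e9 * 8 / PARAMS

q4_k_m = bits_per_weight(2.6)  # ~5.2 bits/weight (4-bit weights + scales)
bf16 = bits_per_weight(7.9)    # ~15.8 bits/weight, close to BF16's 16 bits

print(round(q4_k_m, 1), round(bf16, 1))  # → 5.2 15.8
```

The Q4_K_M figure lands above 4 bits because K-quant formats store per-block scale factors alongside the 4-bit weights.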

Key Points
  • Achieves 0 refusals out of 465 test cases with no capability loss
  • 4B parameters, 32 layers, 262K native context, multimodal architecture
  • Available in multiple GGUF quantizations from 2.6GB to 7.9GB for local deployment

Why It Matters

Enables fully unrestricted AI applications to run locally, free of provider-side content restrictions—useful for creative writing and research use cases.