Qwen3.6 35B A3B Heretic (KLD 0.0015!) Incredible model. Best 35B I have found!
This 35B model fits in 24GB VRAM and outperforms the original Qwen3.6 35B on UGI benchmarks.
A new community fine-tune of Qwen3.6 35B, dubbed A3B Heretic, has emerged as the top uncensored model in its class, according to early adopter reports. With a KLD (Kullback-Leibler divergence) of just 0.0015, it stays closely aligned with the original model on harmless prompts while pushing boundaries on creative and role-playing tasks. The model is optimized for consumer hardware: at IQ4_XS quantization with a Q8 KV cache and 262K context, it fits comfortably in 24GB VRAM and reliably executes multi-turn tool calls.
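A KLD figure like this is typically obtained by comparing the two models' next-token distributions over the same evaluation text. The post doesn't state the author's exact tooling, so here is a minimal sketch of the underlying measurement, assuming you already have both models' raw logits for the same token positions (the function and variable names are illustrative):

```python
import numpy as np
from scipy.special import log_softmax

def mean_token_kld(base_logits: np.ndarray, tuned_logits: np.ndarray) -> float:
    """Mean per-token KL(base || tuned).

    Both arrays have shape (n_tokens, vocab_size) and hold the raw
    logits each model produced at the same positions of the same text.
    A value near 0 means the fine-tune's token distributions barely
    deviate from the original model's.
    """
    log_p = log_softmax(base_logits, axis=-1)   # original model
    log_q = log_softmax(tuned_logits, axis=-1)  # Heretic fine-tune
    p = np.exp(log_p)
    # KL(p || q) summed over the vocabulary, then averaged over tokens
    return float(np.sum(p * (log_p - log_q), axis=-1).mean())
```

In practice, KLD numbers quoted for GGUF releases are often produced with llama.cpp's `llama-perplexity --kl-divergence` workflow against saved base-model logits, though whether that is how this 0.0015 was measured isn't stated.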
Early benchmarks show A3B Heretic outperforming the original Qwen3.6 35B on the NatInt section of the UGI (Uncensored General Intelligence) leaderboard, echoing similar gains seen in llmfan's earlier 3.5 35B fine-tune. Users report that the model feels smarter than the base version, likely thanks to curated training data that enhances reasoning and coherence. For professionals running local LLMs for research, content generation, or agentic workflows, this model offers a compelling balance of uncensored capability and hardware efficiency.
- A3B Heretic fine-tune of Qwen3.6 35B achieves KLD 0.0015, staying close to the original on harmless prompts
- Runs at IQ4_XS + Q8 KV cache with 262K context in 24GB VRAM, supporting multi-turn tool calls (see the launch sketch after this list)
- Outperforms original Qwen3.6 35B on UGI NatInt benchmarks, with users reporting superior reasoning
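As a concrete starting point, the quoted configuration maps onto a llama.cpp server launch roughly like the sketch below. The GGUF filename and port are placeholder assumptions, and flag spellings can shift between llama.cpp builds, so verify against `llama-server --help` on yours; note that a quantized V cache requires flash attention to be enabled:

```python
import subprocess

# Hypothetical filename; use the actual IQ4_XS GGUF you downloaded.
MODEL = "Qwen3.6-35B-A3B-Heretic-IQ4_XS.gguf"

# Sketch of a llama.cpp server launch matching the quoted setup:
# IQ4_XS weights, Q8_0 KV cache, 262K context, all layers on the GPU.
subprocess.run([
    "llama-server",
    "-m", MODEL,
    "--ctx-size", "262144",        # 262K context window
    "--n-gpu-layers", "99",        # offload every layer to the 24GB GPU
    "--cache-type-k", "q8_0",      # quantize the K cache to 8-bit
    "--cache-type-v", "q8_0",      # quantize the V cache to 8-bit
    "--flash-attn",                # required for a quantized V cache
    "--port", "8080",
], check=True)
```

Quantizing the KV cache to Q8 is what makes the 262K window feasible in 24GB alongside the IQ4_XS weights; an FP16 cache at that context length would not fit.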
Why It Matters
This uncensored fine-tune delivers near-original reasoning quality on consumer hardware, enabling local AI workflows without cloud costs.
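For the agentic workflows mentioned above, the local llama.cpp server exposes an OpenAI-compatible chat endpoint, so a multi-turn tool-call round trip looks roughly like this sketch. The `get_time` tool, the base URL, the port, and the model name are illustrative assumptions, not part of the release:

```python
import json
from openai import OpenAI

# Points at the local llama-server started earlier; URL and port are assumptions.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

# A toy tool definition; any JSON-schema function works the same way.
tools = [{
    "type": "function",
    "function": {
        "name": "get_time",
        "description": "Return the current UTC time as an ISO string.",
        "parameters": {"type": "object", "properties": {}},
    },
}]

def get_time() -> str:
    from datetime import datetime, timezone
    return datetime.now(timezone.utc).isoformat()

messages = [{"role": "user", "content": "What time is it in UTC?"}]
reply = client.chat.completions.create(model="local", messages=messages, tools=tools)
msg = reply.choices[0].message

# If the model asked for the tool, run it and send the result back for a
# second turn: this round trip is the multi-turn loop the post describes.
if msg.tool_calls:
    messages.append(msg)
    for call in msg.tool_calls:
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps({"utc_time": get_time()}),
        })
    reply = client.chat.completions.create(model="local", messages=messages, tools=tools)

print(reply.choices[0].message.content)
```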