Qwen3.6 35B A3B Heretic (KLD 0.0015!) Incredible model. Best 35B I have found!
This 35B model fits in 24GB VRAM and outperforms the original Qwen3.6 35B on UGI benchmarks.
A new community fine-tune of Qwen3.6 35B, dubbed A3B Heretic, has emerged as the top uncensored model in its class, according to early adopter reports. With a KLD (Kullback-Leibler divergence) of just 0.0015, it stays closely aligned with the original model on harmless prompts while pushing boundaries on creative and role-playing tasks. The model is optimized for consumer hardware: at IQ4_XS quantization with a Q8 KV cache and 262K context, it fits comfortably in 24GB VRAM and reliably executes multi-turn tool calls.
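A KLD figure like this is typically obtained by comparing the two models' next-token distributions over the same evaluation text. The post doesn't state the author's exact tooling, so here is a minimal sketch of the underlying measurement, assuming you already have both models' raw logits for the same token positions (the function and variable names are illustrative):

```python
import numpy as np
from scipy.special import log_softmax

def mean_token_kld(base_logits: np.ndarray, tuned_logits: np.ndarray) -> float:
    """Mean per-token KL(base || tuned).

    Both arrays have shape (n_tokens, vocab_size) and hold the raw
    logits each model produced at the same positions of the same text.
    A value near 0 means the fine-tune's token distributions barely
    deviate from the original model's.
    """
    log_p = log_softmax(base_logits, axis=-1)   # original model
    log_q = log_softmax(tuned_logits, axis=-1)  # Heretic fine-tune
    p = np.exp(log_p)
    # KL(p || q) summed over the vocabulary, then averaged over tokens
    return float(np.sum(p * (log_p - log_q), axis=-1).mean())
```

In practice, KLD numbers quoted for GGUF releases are often produced with llama.cpp's `llama-perplexity --kl-divergence` workflow against saved base-model logits, though whether that is how this 0.0015 was measured isn't stated.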
Early benchmarks show A3B Heretic outperforming the original Qwen3.6 35B on the NatInt section of the UGI (Uncensored General Intelligence) leaderboard, echoing similar gains seen in llmfan's earlier 3.5 35B fine-tune. Users report that the model feels smarter than the base version, likely thanks to curated training data that enhances reasoning and coherence. For professionals running local LLMs for research, content generation, or agentic workflows, this model offers a compelling balance of uncensored capability and hardware efficiency.
- A3B Heretic fine-tune of Qwen3.6 35B achieves KLD 0.0015, staying close to the original on harmless prompts
- Runs at IQ4_XS + Q8 KV cache with 262K context in 24GB VRAM, supporting multi-turn tool calls (see the launch sketch after this list)
- Outperforms original Qwen3.6 35B on UGI NatInt benchmarks, with users reporting superior reasoning
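As a concrete starting point, the quoted configuration maps onto a llama.cpp server launch roughly like the sketch below. The GGUF filename and port are placeholder assumptions, and flag spellings can shift between llama.cpp builds, so verify against `llama-server --help` on yours; note that a quantized V cache requires flash attention to be enabled:

```python
import subprocess

# Hypothetical filename; use the actual IQ4_XS GGUF you downloaded.
MODEL = "Qwen3.6-35B-A3B-Heretic-IQ4_XS.gguf"

# Sketch of a llama.cpp server launch matching the quoted setup:
# IQ4_XS weights, Q8_0 KV cache, 262K context, all layers on the GPU.
subprocess.run([
    "llama-server",
    "-m", MODEL,
    "--ctx-size", "262144",        # 262K context window
    "--n-gpu-layers", "99",        # offload every layer to the 24GB GPU
    "--cache-type-k", "q8_0",      # quantize the K cache to 8-bit
    "--cache-type-v", "q8_0",      # quantize the V cache to 8-bit
    "--flash-attn",                # required for a quantized V cache
    "--port", "8080",
], check=True)
```

Quantizing the KV cache to Q8 is what makes the 262K window feasible in 24GB alongside the IQ4_XS weights; an FP16 cache at that context length would not fit.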
Why It Matters
This uncensored fine-tune delivers near-original reasoning quality on consumer hardware, enabling local AI workflows without cloud costs.
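For the agentic workflows mentioned above, the local llama.cpp server exposes an OpenAI-compatible chat endpoint, so a multi-turn tool-call round trip looks roughly like this sketch. The `get_time` tool, the base URL, the port, and the model name are illustrative assumptions, not part of the release:

```python
import json
from openai import OpenAI

# Points at the local llama-server started earlier; URL and port are assumptions.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

# A toy tool definition; any JSON-schema function works the same way.
tools = [{
    "type": "function",
    "function": {
        "name": "get_time",
        "description": "Return the current UTC time as an ISO string.",
        "parameters": {"type": "object", "properties": {}},
    },
}]

def get_time() -> str:
    from datetime import datetime, timezone
    return datetime.now(timezone.utc).isoformat()

messages = [{"role": "user", "content": "What time is it in UTC?"}]
reply = client.chat.completions.create(model="local", messages=messages, tools=tools)
msg = reply.choices[0].message

# If the model asked for the tool, run it and send the result back for a
# second turn: this round trip is the multi-turn loop the post describes.
if msg.tool_calls:
    messages.append(msg)
    for call in msg.tool_calls:
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps({"utc_time": get_time()}),
        })
    reply = client.chat.completions.create(model="local", messages=messages, tools=tools)

print(reply.choices[0].message.content)
```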