Tested at 200k context across 5 sessions with zero glitches, loops, or repeated tool calls?

Tested at 200k context across 5 sessions with zero glitches, loops, or repeated tool calls.

Handled an abrupt task switch at 120k tokens seamlessly and solved the new task correctly?

Handled an abrupt task switch at 120k tokens seamlessly and solved the new task correctly.

Recommended LM Studio settings?

temperature 0.7, top K 20, presence penalty 1.5, with specific first-line prompt "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."

Open Source

Qwen3.6-35B Uncensored Model Handles 200K Context with No Glitches

r/LocalLLaMA May 24, 2026

⚡New uncensored Qwen variant runs 200k context without errors across 5 sessions.

Deep Dive

LuffyTheFox has released a new variant of the Qwen-3.6-35B model family: Qwen3.6-35B-A3B-Uncensored-Genesis-V2-APEX-MTP. This uncensored model comes with support for both APEX and MTP (Multi-Token Prediction) quantization, and is available as GGUF and FP8 safetensors on Hugging Face. In rigorous testing on a Beelink GTR9 Pro mini PC equipped with Strix Halo hardware, a friend of the creator ran five separate sessions at Q8_K_P MTP quant with a context window of 200,000 tokens. The model performed flawlessly: no glitches, no response loops, and no repeated tool calls. Remarkably, after consuming 120k tokens of context, the tester injected a completely new, unrelated task. The model calmly picked up the new instruction and solved it correctly, demonstrating exceptional context management and instruction following.

The model is fully uncensored and includes MTP support via APEX and APEX Compact quantization. Safetensors are provided for Apple MLX conversion, making it accessible to Mac users. Recommended settings in LM Studio include a system prompt that starts with "You are Qwen, created by Alibaba Cloud. You are a helpful assistant." The optimal parameters are temperature 0.7, top K sampling 20, presence penalty 1.5, repeat penalty 1.0, top P 0.8, min P 0, and seed 42. The creator notes that performance may degrade without the specific first-line prompt. MTP-safetensors are also in development. This release underscores the growing trend of uncensored, long-context models that can maintain coherence and adaptability over extremely long interactions.

Key Points

Tested at 200k context across 5 sessions with zero glitches, loops, or repeated tool calls.
Handled an abrupt task switch at 120k tokens seamlessly and solved the new task correctly.
Recommended LM Studio settings: temperature 0.7, top K 20, presence penalty 1.5, with specific first-line prompt "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."

Why It Matters

A robust, uncensored long-context model that maintains stability and adaptability for complex, multi-turn tasks.

Read Original Article

Qwen3.6-35B Uncensored Model Handles 200K Context with No Glitches

Why It Matters

Related Articles

🚀 Stay Ahead in AI