PSA: Having issues with Qwen 3.5 overthinking? Give it a tool — it can help dramatically.
Users discover enabling tools changes Qwen 3.5's verbose reasoning to Claude-like concise traces.
A community-driven fix has emerged for users frustrated by verbose, over-elaborate reasoning in Alibaba's Qwen 3.5. The issue stems from the model's default behavior when no tools are available: it falls into lengthy, Gemini-like reasoning traces full of bullet-pointed internal monologues. This 'overthinking' slows responses and clutters outputs, and has persisted since the model's release two months ago.
The fix is surprisingly simple: enable the model's native function-calling tools. When tools are activated, even if they are never called, Qwen 3.5 switches its reasoning style entirely, abandoning the verbose trace for short, natural, Claude-like reasoning and producing faster, more direct responses. This works across interfaces such as Open-WebUI (with 'native' function calling), OpenCode, and Hermes Agent. For best results, users should also set the `presence_penalty` sampling parameter to between 1.0 and 1.5, as recommended in Unsloth's tuning guides.
- Enabling tools shifts Qwen 3.5 from verbose Gemini-style reasoning to concise Claude-like traces
- Adjust the `presence_penalty` parameter to 1.0-1.5 alongside tool activation for best results
- The fix works in Open-WebUI, OpenCode, and Hermes Agent by using native function calling
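The steps above can be sketched as a request payload for any OpenAI-compatible chat endpoint. This is a minimal illustration, not a confirmed recipe: the model id and the `noop` tool below are placeholder assumptions — the point is only that *some* valid tool definition is present and `presence_penalty` sits in the recommended 1.0–1.5 range.

```python
import json

# Sketch of an OpenAI-compatible /v1/chat/completions payload.
# Assumptions: the model id "qwen3.5" and the "noop" tool are
# illustrative placeholders; substitute whatever your server exposes.
payload = {
    "model": "qwen3.5",  # placeholder id; match your inference server
    "messages": [
        {"role": "user", "content": "Compare B-trees and LSM-trees."}
    ],
    # Any valid tool definition works; the model need not actually call it.
    # Its mere presence is what flips the reasoning style.
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "noop",
                "description": "Placeholder tool; not expected to be invoked.",
                "parameters": {"type": "object", "properties": {}},
            },
        }
    ],
    # Unsloth's tuning guidance: presence_penalty between 1.0 and 1.5.
    "presence_penalty": 1.1,
}

print(json.dumps(payload, indent=2))
```

Sending this payload (e.g. via `requests.post` to your server's `/v1/chat/completions` route) should yield the concise, Claude-like trace even though the `noop` tool is never used.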
Why It Matters
This simple tweak lets developers get faster, cleaner outputs from a leading open-source model, improving workflow efficiency.