Ran Qwen 3.5 9B on M1 Pro (16GB) as an actual agent, not just a chat demo. Honest results.
A developer ran real automation tasks locally, proving small models can handle practical agent work.
Developer Joozio conducted a practical experiment running Alibaba's Qwen 3.5 9B model as a functional agent on a consumer 16GB M1 Pro MacBook, moving beyond simple chat demos. Using Ollama to provide an OpenAI-compatible API, he integrated the local model into his existing Claude Code-based automation system with a one-line configuration change. The test involved executing real tasks from his personal queue, focusing on whether a smaller, local model could handle a meaningful subset of agentic work without relying on cloud APIs like Claude Opus. The core finding is a shift in perspective: local AI isn't about competing with frontier models, but about identifying which agent tasks—often simple file reading, formatting, and routing—can run efficiently and privately on-device.
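The article doesn't reproduce the configuration itself, but the "one-line change" it describes typically amounts to repointing an OpenAI-compatible client at Ollama's local endpoint. A minimal sketch, assuming Ollama's default port and using a placeholder model tag (the exact tag for the model in the article is not given):

```python
import json
from urllib import request

# Ollama exposes an OpenAI-compatible API at this local endpoint by default.
# The "one-line change": swap the cloud base URL for this local one.
OLLAMA_BASE_URL = "http://localhost:11434/v1"

def build_chat_request(messages, model="qwen3:8b"):
    """Build an OpenAI-style chat-completions payload (model tag is a placeholder)."""
    return {"model": model, "messages": messages}

def chat(messages, model="qwen3:8b"):
    """POST the payload to the local Ollama server and return the reply text."""
    req = request.Request(
        f"{OLLAMA_BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(messages, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the request shape matches the cloud API, an existing automation stack can switch between local and hosted backends without touching the rest of the pipeline.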
The Qwen 3.5 9B model performed reasonably well on straightforward agent functions: memory recall (reading and surfacing context from structured files) worked correctly, and basic tool calling was reliable for simple requests. There was a noticeable gap in creative and complex reasoning compared to larger models, but the speed was acceptable for the targeted use case. Notably, Joozio also probed the frontier of on-device AI by running even smaller Qwen models (0.8B and 2B) entirely offline on an iPhone 17 Pro using the open-source PocketPal AI app. The experiment validates a growing trend: leveraging cost-effective, private local inference for specific automation pipelines, reducing dependency on per-token cloud costs and data transmission for suitable tasks.
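The summary doesn't show the tool schemas involved, but "basic tool calling" in the OpenAI-compatible format a local server accepts looks roughly like the sketch below. The `read_file` tool and the dispatch helper are hypothetical illustrations, not the author's actual code:

```python
import json

# Hypothetical tool schema in the OpenAI function-calling format; an
# OpenAI-compatible local server accepts the same "tools" field.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read a UTF-8 text file and return its contents.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

def dispatch_tool_call(tool_call, registry):
    """Run the function named in an OpenAI-style tool call and return its result."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    return registry[name](**args)
```

Simple, well-scoped tools like this (file reads, formatting, routing) are exactly the subset of agent work the article argues a small local model can handle reliably.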
- Qwen 3.5 9B ran real automation tasks on a 16GB M1 Pro MacBook via Ollama's OpenAI-compatible API.
- The model successfully handled memory recall and basic tool calling, proving it viable for simple agent work.
- Tiny models (Qwen 0.8B/2B) also ran fully offline on an iPhone 17 Pro, showcasing on-device AI maturity.
Why It Matters
Enables cost-effective, private automation on consumer hardware, reducing reliance on cloud APIs for simple agent tasks.