Eve uses fine-tuned Qwen 3.5 models (4B/8B) on local GPU for personality and conversation, costing nothing per message?

Eve uses fine-tuned Qwen 3.5 models (4B/8B) on local GPU for personality and conversation, costing nothing per message.

Three-tier intent routing?

casual to 4B, light code to 8B, heavy to MiniMax M3 (1M context) or Qwen 397B.

40-round autonomous agentic loop with 16 tools, passed 9/9 unit tests on first try with a cold prompt?

40-round autonomous agentic loop with 16 tools, passed 9/9 unit tests on first try with a cold prompt.

Media & Culture

Reddit user builds Eve, a local autonomous coding agent with a fine-tuned soul

r/ArtificialInteligence June 02, 2026

⚡Meet Eve: a coding agent with personality baked into weights, running entirely on your GPU.

Deep Dive

Cloud-based coding agents like Claude Code and Cursor come with tradeoffs: per-token costs, data leaving your machine, and subscription lock-in. To solve this, Reddit user jeffgreen311 created Eve, a local autonomous coding agent that runs on Ollama. Eve has a unique two-layer design. The 'soul' layer uses fine-tuned Qwen 3.5 models (4B at 2.5GB, 8B at 3.4GB) that store personality directly in the weights—not via system prompts. This runs entirely on local GPU and costs nothing per message. The 'worker' layer is cloud-on-demand, using MiniMax M3 (1M token context) for heavy multi-file tasks and Qwen 3.5 397B as a fallback.

Eve uses a three-tier intent router to decide where each message goes: casual to local 4B, light code tasks to local 8B, and complex work to M3. Within a task, Eve executes a 40-round autonomous loop: it writes files, runs bash, reads errors, fixes bugs, and repeats—all while streaming every thought live to a browser. A STEER bar lets users inject mid-task corrections. Eve comes with 16 tools including file operations, git, web search, screenshot, and GUI control. In a real test, Eve passed 9/9 unit tests on the first attempt with a cold prompt. This shows a viable path toward private, self-hosted AI coding assistants with no recurring fees.

Key Points

Eve uses fine-tuned Qwen 3.5 models (4B/8B) on local GPU for personality and conversation, costing nothing per message.
Three-tier intent routing: casual to 4B, light code to 8B, heavy to MiniMax M3 (1M context) or Qwen 397B.
40-round autonomous agentic loop with 16 tools, passed 9/9 unit tests on first try with a cold prompt.

Why It Matters

Eve proves that private, local coding agents with real personality are feasible, reducing cloud dependency and subscription costs.

Read Original Article

Reddit user builds Eve, a local autonomous coding agent with a fine-tuned soul

Why It Matters

Related Articles

🚀 Stay Ahead in AI