v4.8
Redesigned composer, smooth scroll, and Electron improvements for local LLM power users.
oobabooga, the creator of the popular text-generation-webui, has released v4.8 of the now desktop-native application. The project (47k GitHub stars) focuses on making local LLM inference accessible through a polished graphical interface. Version 4.8 introduces a completely redesigned chat composer: a taller input area with paperclip and action buttons pinned to the bottom, closely mimicking the UI patterns of Gemini and DeepSeek. Additional UI refinements include a smooth scroll animation when sending messages, more breathing room for action buttons on the last message, and disabled spellcheck in the chat input to avoid interference with model output.
On the technical side, the Electron wrapper now persists window bounds and maximized state across launches. Users who prefer a browser-only experience can pass the new --no-electron flag to skip the desktop window entirely. The release also fixes several bugs: missing log colors on Windows, failures to load large character pictures, speculative decoding breaking after llama.cpp argument renames, and the truncation length reverting after model load. Dependencies were updated to the latest llama.cpp (commit 68380ae) and ik_llama.cpp (commit 9a26522). Portable builds now cover NVIDIA CUDA 12.4/13.1, AMD/Intel Vulkan, AMD ROCm 7.2, and CPU-only variants for Windows and Linux, plus Apple Silicon and Intel macOS builds. The ik_llama.cpp fork, which adds new quantization types, is also offered as an alternative engine for NVIDIA and CPU platforms.
- Redesigned chat composer: taller input with pinned actions, inspired by Gemini and DeepSeek UIs.
- Electron improvements: window bounds persistence, new --no-electron flag for browser-only use.
- Updated dependencies: llama.cpp to ggml-org/llama.cpp@68380ae and ik_llama.cpp to ikawrakow/ik_llama.cpp@9a26522.
Why It Matters
A polished desktop experience makes running powerful LLMs locally as easy as using ChatGPT, with no cloud dependency.