ggufy: One-click image model quantization for GPU-constrained users
A drag-and-drop tool that quantizes image models in minutes, with low RAM usage.
Frustrated by the complexity and high RAM demands of existing image model quantization tools, developer qskousen built ggufy. Written in Zig for efficiency, ggufy is a cross-platform utility with both a command-line interface and a GUI that accepts drag-and-dropped files. It compiles into single-file executables for Linux, Windows, and macOS (both ARM64 and x86), with pre-built binaries available on GitHub Releases.
ggufy supports converting to and from GGUF and safetensors formats, covering a wide range of datatypes including q3_k through q8_0, f32, bf16, f16, f8_e4m3, f8_e5m2, scaled fp8, mxfp8, and nvfp4. The tool is memory-efficient and can quantize ZiT in about 1.5 minutes on the developer's machine. While SDNQ is not yet supported, qskousen plans to add it once the format is figured out. The tool is designed to be extensible, with users encouraged to request additional models or features via GitHub issues.
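To make the two container formats concrete, here is a minimal Python sketch of telling them apart by their leading bytes. It is not taken from ggufy (which is written in Zig); the function name `sniff_format` is hypothetical. It relies only on the published layouts: a GGUF file begins with the four ASCII bytes `GGUF`, and a safetensors file begins with an 8-byte little-endian header length followed by that many bytes of JSON.

```python
import json
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def sniff_format(data: bytes) -> str:
    """Guess the container format from the leading bytes of a model file.

    Hypothetical helper for illustration; returns "gguf", "safetensors",
    or "unknown".
    """
    if data[:4] == GGUF_MAGIC:
        return "gguf"
    if len(data) >= 8:
        # safetensors: u64 little-endian header length, then a JSON header.
        (header_len,) = struct.unpack("<Q", data[:8])
        header = data[8:8 + header_len]
        if header_len > 0 and len(header) == header_len:
            try:
                if isinstance(json.loads(header), dict):
                    return "safetensors"
            except ValueError:
                # Covers both invalid JSON and undecodable bytes.
                pass
    return "unknown"


# Build tiny in-memory samples instead of reading real model files.
_header = json.dumps(
    {"w": {"dtype": "F32", "shape": [1], "data_offsets": [0, 4]}}
).encode("utf-8")
sample_safetensors = struct.pack("<Q", len(_header)) + _header + b"\x00" * 4
sample_gguf = GGUF_MAGIC + struct.pack("<I", 3)  # magic + version field

print(sniff_format(sample_gguf))
print(sniff_format(sample_safetensors))
```

For a real file, only the first few kilobytes are needed for this check, so a tool can identify the format without loading any tensor data into memory.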
- Single-file executables for Linux, Windows, and macOS (ARM64/x86) with pre-built binaries
- Supports 10+ datatypes including q8_0, f8_e4m3, and nvfp4 for flexible quantization
- Quantizes ZiT in ~1.5 minutes on the developer's machine, with low RAM usage
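For readers unfamiliar with the datatype names, the following Python sketch illustrates the idea behind q8_0 as defined in GGML: weights are grouped into blocks of 32, and each block stores one half-precision scale plus 32 signed 8-bit integers (34 bytes per block, or 8.5 bits per weight). This is a simplified illustration, not ggufy's implementation; the function names are hypothetical, and rounding details may differ from the reference code.

```python
import struct

BLOCK_SIZE = 32  # q8_0 quantizes weights in blocks of 32 values


def to_f16(x: float) -> float:
    """Round a float to the nearest half-precision value (scale storage type)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_q8_0_block(values):
    """Quantize one block of 32 floats to (scale, list of 32 int8 values).

    Simplified sketch: the scale maps the largest magnitude to 127.
    """
    if len(values) != BLOCK_SIZE:
        raise ValueError("q8_0 blocks hold exactly 32 values")
    amax = max(abs(v) for v in values)
    scale = to_f16(amax / 127.0)
    if scale == 0.0:
        return 0.0, [0] * BLOCK_SIZE
    # Clamp because rounding the scale to f16 can push v / scale past 127.
    quants = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, quants


def dequantize_q8_0_block(scale, quants):
    """Recover approximate floats from a quantized block."""
    return [scale * q for q in quants]


weights = [float(i - 16) for i in range(BLOCK_SIZE)]
scale, quants = quantize_q8_0_block(weights)
restored = dequantize_q8_0_block(scale, quants)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(f"scale={scale:.5f} max_error={max_error:.5f}")
```

The smaller formats in the list follow the same block-plus-scale pattern with fewer bits per weight, which is where the size savings come from.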
Why It Matters
ggufy lowers the barrier to image model compression, making it practical for developers on modest hardware.