Open Source

Local manga translator with a built-in LLM, written in Rust with llama.cpp integration

Open-source tool combines object detection, OCR, and llama.cpp for one-click manga translation.

Deep Dive

Developer mayocream has launched Koharu, an open-source tool that automates the translation of manga and other images. Written in Rust, it integrates the llama.cpp inference engine so it can run LLMs such as Google's Gemma 4 family and Alibaba's Qwen3.5 family locally. The application chains several stages into one pipeline: it first detects text regions and analyzes page layout, then uses a vision LLM for optical character recognition (OCR), translates the recognized text with the LLM, and finally uses fine-tuned inpainting models to clear the original lettering so the translation can be typeset back into the image. The result is a one-click workflow that covers the whole process from image input to typeset output.
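To make the order of the stages concrete, here is a minimal Rust sketch of such a pipeline. It is an illustration only: the `Region`, `TextBlock`, `Stages`, `run_pipeline`, and `StubStages` names are hypothetical and are not taken from Koharu's source, and the stub stages return fixed values instead of running real models.

```rust
// Hypothetical types and stage names; none of this is Koharu's actual API.

/// Rectangle around one detected text region, in pixels.
#[derive(Debug, Clone, PartialEq)]
pub struct Region {
    pub x: u32,
    pub y: u32,
    pub w: u32,
    pub h: u32,
}

/// One text block as it moves through the pipeline.
#[derive(Debug, Clone, PartialEq)]
pub struct TextBlock {
    pub region: Region,
    pub source: String,     // filled in by the OCR stage
    pub translated: String, // filled in by the translation stage
}

/// The stages named in the article, as a trait so each one can be swapped or stubbed.
pub trait Stages {
    fn detect(&self, page: &[u8]) -> Vec<Region>;
    fn ocr(&self, page: &[u8], region: &Region) -> String;
    fn translate(&self, text: &str) -> String;
    /// Returns the page with the original text inside `regions` painted out.
    fn inpaint(&self, page: &[u8], regions: &[Region]) -> Vec<u8>;
}

/// Runs detection, OCR, translation, and inpainting in that order.
/// Returns the cleaned page plus the blocks a typesetter would draw onto it.
pub fn run_pipeline<S: Stages>(stages: &S, page: &[u8]) -> (Vec<u8>, Vec<TextBlock>) {
    let regions = stages.detect(page);
    let blocks: Vec<TextBlock> = regions
        .iter()
        .map(|r| {
            let source = stages.ocr(page, r);
            let translated = stages.translate(&source);
            TextBlock { region: r.clone(), source, translated }
        })
        .collect();
    let cleaned = stages.inpaint(page, &regions);
    (cleaned, blocks)
}

/// Stand-in stages that return fixed values, so the flow can be exercised without models.
pub struct StubStages;

impl Stages for StubStages {
    fn detect(&self, _page: &[u8]) -> Vec<Region> {
        vec![Region { x: 10, y: 20, w: 100, h: 40 }]
    }
    fn ocr(&self, _page: &[u8], _region: &Region) -> String {
        "こんにちは".to_string()
    }
    fn translate(&self, text: &str) -> String {
        format!("[en] {}", text)
    }
    fn inpaint(&self, page: &[u8], _regions: &[Region]) -> Vec<u8> {
        page.to_vec() // a real stage would repaint the pixels inside each region
    }
}
```

Calling `run_pipeline(&StubStages, &page_bytes)` returns the cleaned page together with one `TextBlock` per detected region. Keeping the translated blocks separate from the cleaned image is one plausible way to support later proofreading and re-typesetting, since the text stays editable instead of being baked into the pixels.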

Beyond automation, Koharu includes a built-in editor that lets users proofread translations and adjust typography (font, size, and color), so it works as a "mini Photoshop" for translated text. For flexibility, it supports OpenAI-compatible APIs, meaning users can connect it to local servers like LM Studio or cloud services like OpenRouter instead of running models inside the app. As a fully open-source project hosted on GitHub, Koharu goes beyond simple text extraction to offer a complete, editable production tool, which lowers the barrier to automated translation for niche media.
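For readers unfamiliar with what talking to such an external LLM service involves, the sketch below builds the endpoint URL and JSON body for a chat completions request in the widely used OpenAI-style shape (a `model` field plus a `messages` array of `role`/`content` pairs). It is a hedged illustration, not Koharu's code: the function names are hypothetical, the base URL and model name in the usage note are placeholders, and JSON is assembled by hand only to keep the example dependency-free. A real client would typically use a JSON library and an HTTP crate.

```rust
// Hypothetical helpers for illustration; not taken from Koharu.

/// Escapes a string for use inside a JSON string literal.
pub fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters must be written as \u00XX escapes.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Joins a base URL (assumed to end in the API version, e.g. ".../v1")
/// with the chat completions path, tolerating a trailing slash.
pub fn chat_endpoint(base_url: &str) -> String {
    format!("{}/chat/completions", base_url.trim_end_matches('/'))
}

/// Builds a chat completions request body with one system and one user message.
pub fn chat_request_body(model: &str, system: &str, user: &str) -> String {
    format!(
        "{{\"model\":\"{}\",\"messages\":[{{\"role\":\"system\",\"content\":\"{}\"}},{{\"role\":\"user\",\"content\":\"{}\"}}]}}",
        json_escape(model),
        json_escape(system),
        json_escape(user)
    )
}
```

As a usage example, `chat_endpoint("http://localhost:1234/v1")` yields `http://localhost:1234/v1/chat/completions`, and `chat_request_body("some-model", "Translate Japanese to English.", "こんにちは")` yields the body to POST there with a `Content-Type: application/json` header. Hosted services usually also expect an `Authorization: Bearer <key>` header. Because only the base URL and key change between providers, the same client code can target a local server or a cloud service.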

Key Points
  • Built in Rust with llama.cpp integration, supporting Gemma 4 and Qwen3.5 model families.
  • Automates a full pipeline: text detection, layout analysis, vision-LLM-based OCR, and inpainting.
  • Includes an editing suite for proofreading and typesetting, with support for OpenAI-compatible APIs to reach external LLM services.

Why It Matters

By automating the entire technical pipeline in one local, open-source tool, Koharu makes manga translation more accessible and can save hours of manual work.