Open Source

NuExtract3: Open-weight 4B VLM for document extraction and Markdown conversion

r/LocalLLaMA May 25, 2026

⚡Convert PDFs, invoices, and tables to Markdown with a 4GB VRAM-friendly model.

Deep Dive

NuMind, the company behind NuMarkdown, has released NuExtract3, a 4B-parameter open-weight vision-language model (VLM) built on Qwen3.5-4B. It converts complex document images (PDFs, screenshots, forms, tables, receipts, invoices) into Markdown and extracts structured data according to a user-defined JSON template. The model is released under the Apache-2.0 license, which permits commercial use, modification, and redistribution.
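To make the "user-defined JSON template" idea concrete, here is a minimal sketch of what such a template could look like and how a returned JSON object might be checked against it. The field names, the `"string"`/`"number"` type hints, and the `conforms` helper are all hypothetical and chosen for illustration; the release notes summarized here do not specify NuExtract3's actual template syntax, so consult the official documentation for that.

```python
import json

# Hypothetical template: keys name the fields to extract, values are
# illustrative type hints (not NuExtract3's documented syntax).
INVOICE_TEMPLATE = {
    "invoice_number": "string",
    "issue_date": "string",
    "total": "number",
    "line_items": [{"description": "string", "amount": "number"}],
}


def conforms(template, value):
    """Return True if `value` has the same shape as `template`.

    Leaf fields may be None, meaning the field was not found in the document.
    """
    if isinstance(template, dict):
        return (
            isinstance(value, dict)
            and set(value) == set(template)
            and all(conforms(template[k], value[k]) for k in template)
        )
    if isinstance(template, list):
        # A one-element list describes the shape of every item.
        return isinstance(value, list) and all(
            conforms(template[0], item) for item in value
        )
    if value is None:
        return True
    if template == "number":
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    return isinstance(value, str)


# Example of a model response (made up for this sketch), parsed from JSON text.
response = json.loads(
    '{"invoice_number": "INV-7", "issue_date": "2026-05-01", '
    '"total": 120.5, "line_items": [{"description": "Widget", "amount": 120.5}]}'
)
print(conforms(INVOICE_TEMPLATE, response))
```

Validating the shape of the output before using it downstream is a generic safeguard for any template-driven extractor, not something specific to this model.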

Trained on a single node of 8xH100 GPUs for three days, NuExtract3 handles long documents well but is optimized for page-by-page processing, which improves speed and lets pages be processed in parallel. Quantized variants (GPTQ, W8A8, FP8, Q4, Q6) bring the memory requirement down to as little as 4GB of VRAM. Weights are available in Safetensors, GGUF, and MLX formats and run on inference engines such as vLLM, SGLang, and llama.cpp. A free Hugging Face Space is available without sign-up, and detailed documentation and a blog post accompany the release. The creators are preparing a paper for peer review.
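The page-by-page approach can be sketched as a simple fan-out over pages. The sketch below assumes a hypothetical `convert_page` callable that turns one page image into Markdown; in practice it would wrap a request to whichever inference server hosts the model, and `fake_convert_page` here is only a stand-in so the example runs without a model.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def convert_document(
    pages: Sequence[bytes],
    convert_page: Callable[[bytes], str],
    max_workers: int = 4,
) -> str:
    """Convert each page independently and join the results in page order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, even if pages finish out of order.
        markdown_pages: List[str] = list(pool.map(convert_page, pages))
    return "\n\n".join(markdown_pages)


def fake_convert_page(image: bytes) -> str:
    """Hypothetical stand-in for a real model call."""
    return f"# Page with {len(image)} bytes"


result = convert_document([b"a", b"bb", b"ccc"], fake_convert_page)
print(result)
```

Threads suit this pattern when each page conversion is a network call to an inference server, since the server can then batch the concurrent requests.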

Key Points

4B-parameter VLM based on Qwen3.5-4B, Apache-2.0 license, successor to NuMarkdown-8B.
Converts PDFs, tables, and invoices to Markdown and extracts structured data via JSON templates.
Self-hostable with as little as 4GB VRAM via quantized weights (GGUF, MLX, GPTQ); supports vLLM, SGLang, and llama.cpp.
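A rough back-of-the-envelope estimate shows why a quantized 4B model can fit in about 4GB. The sketch assumes a nominal 4 billion parameters and nominal bit widths, and it counts weights only; activations, the KV cache, and per-format overhead add to the real footprint, so treat these numbers as lower bounds rather than measured figures.

```python
def weight_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (weights only)."""
    return params * bits_per_weight / 8 / 1e9


NOMINAL_PARAMS = 4e9  # assumption: the "4B" in the model description

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_gb(NOMINAL_PARAMS, bits):.1f} GB of weights")
```

Under these assumptions, 4-bit weights take roughly 2 GB, which leaves headroom within a 4GB budget for runtime buffers.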

Why It Matters

Enables local, open-source document extraction and Markdown conversion without cloud dependency or high hardware costs.
