NuExtract3: Open-weight 4B VLM for document extraction and Markdown conversion
Convert PDFs, invoices, and tables to Markdown with a 4GB VRAM-friendly model.
Get AI news that actually matters
One email a day. Zero fluff. Join 10,000+ professionals.
Numind, the company behind NuMarkdown, has released NuExtract3—a 4B-parameter open-weight vision-language model (VLM) built on Qwen3.5-4B. It's designed for converting complex document images (PDFs, screenshots, forms, tables, receipts, invoices) into Markdown format and extracting structured data using a user-defined JSON template. The model is released under an Apache-2.0 license, making it freely usable for any task.
Trained on a single node of 8xH100 GPUs for three days, NuExtract3 handles long documents well but is optimized for page-by-page processing for better speed and parallelization. It requires as little as 4GB VRAM, thanks to multiple quantization options (GPTQ, W8A8, FP8, Q4, Q6). Weights are available in Safetensors, GGUF, and MLX formats, supporting inference engines like vLLM, SGLang, and llama.cpp. A free Hugging Face space is available without sign-up, and detailed documentation and a blog post accompany the release. The creators are preparing a paper for peer review.
- 4B-parameter VLM based on Qwen3.5-4B, Apache-2.0 license, successor to NuMarkdown-8B.
- Converts PDFs, tables, and invoices to Markdown and extracts structured data via JSON templates.
- Self-hostable with as little as 4GB VRAM via quantized weights (GGUF, MLX, GPTQ) and supports vLLM, SGLang, llama.cpp.
Why It Matters
Enables local, open-source document extraction and Markdown conversion without cloud dependency or high hardware costs.