Developer Tools

Hugging Face Transformers v5.6 adds Baidu's 4B-parameter Qianfan-OCR and OpenAI Privacy Filter

The latest release introduces four new specialized models for document AI, privacy, segmentation, and table recognition.

Deep Dive

Hugging Face has released Transformers 5.6.0, an update that adds four specialized models. The standout addition is Baidu's Qianfan-OCR, a 4-billion-parameter end-to-end document intelligence model that converts images directly to text, replacing the traditional multi-stage OCR pipeline. It supports prompt-driven tasks such as table extraction, chart understanding, and document Q&A through a capability Baidu calls "Layout-as-Thought". The other major inclusion is OpenAI's Privacy Filter, a bidirectional token-classification model designed for on-premises, high-throughput detection and masking of personally identifiable information (PII) across eight privacy categories.
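To make the "detect, then mask" idea concrete, here is a minimal sketch of the masking step alone. It assumes the detector returns non-overlapping character spans as dictionaries with `entity_group`, `start`, and `end` keys, which is the shape the Transformers token-classification pipeline produces when aggregation is enabled. Whether Privacy Filter is normally used through that pipeline, and the label names `name` and `email`, are assumptions for illustration and not taken from the release notes.

```python
def mask_pii(text, entities, placeholder="[{label}]"):
    """Replace each detected character span with a label placeholder.

    Assumes spans do not overlap. `entities` is a hypothetical detector
    output: dicts with "entity_group", "start", and "end" keys.
    """
    masked = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        label = placeholder.format(label=ent["entity_group"].upper())
        masked = masked[:ent["start"]] + label + masked[ent["end"]:]
    return masked


# Hand-written example spans standing in for real model output.
text = "Contact Jane Doe at jane@example.com."
entities = [
    {"entity_group": "name", "start": 8, "end": 16},
    {"entity_group": "email", "start": 20, "end": 36},
]
result = mask_pii(text, entities)
print(result)  # Contact [NAME] at [EMAIL].
```

Replacing spans from the end of the string backwards avoids recomputing offsets after each substitution, which keeps the function short at the cost of requiring non-overlapping spans.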

The release also features two efficiency-focused models: SAM3-LiteText, a lightweight variant that reduces text encoder parameters by 88% for vision-language segmentation, and SLANet, a CPU-friendly model from Baidu's PaddlePaddle team for fast table structure recognition. Beyond new models, v5.6.0 brings breaking changes to the internal `rotary_fn` and major enhancements to the `transformers serve` command. The serving updates include a new `/v1/completions` endpoint for legacy OpenAI-style completions, multimodal support for audio and video inputs, improved tool-calling via `parse_response`, and better error handling for model mismatches.
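As a rough illustration of the new `/v1/completions` endpoint, the sketch below builds a legacy OpenAI-style completion request using only the Python standard library. The base URL `http://localhost:8000`, the model name `your-org/your-model`, and the `max_tokens` value are placeholders chosen for this example. The payload fields follow the legacy OpenAI completions format, and it is an assumption that the server accepts exactly these fields.

```python
import json
import urllib.request


def build_completion_request(prompt, model,
                             base_url="http://localhost:8000",  # assumed address
                             max_tokens=64):
    """Build (but do not send) a legacy-style completion request."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "your-org/your-model" is a placeholder, not a real checkpoint name.
req = build_completion_request("Once upon a time", model="your-org/your-model")
print(req.full_url)
print(req.data.decode("utf-8"))

# With a server running, the request could be sent like this. The response
# shape is assumed to follow the legacy OpenAI completions format.
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["text"])
```

Building the request separately from sending it makes the payload easy to inspect before pointing it at a live server.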

Key Points
  • Adds Baidu's Qianfan-OCR, a 4B-parameter model for unified document parsing and image-to-text conversion.
  • Introduces OpenAI Privacy Filter for on-premises, high-speed PII detection and masking across 8 categories.
  • Enhances `transformers serve` with a legacy completions endpoint and multimodal audio/video input support.

Why It Matters

The release gives developers specialized, production-ready models for document intelligence, data privacy, and efficient multimodal tasks directly within the widely used Transformers ecosystem.
