Developer Tools

Release 5.8.0

DeepSeek-V4, Gemma 4 Assistant, and IBM Granite Vision arrive in the latest update.

Deep Dive

Hugging Face released Transformers v5.8.0, adding six major models: DeepSeek-V4 (MoE with hybrid local+long-range attention and Manifold-Constrained Hyper-Connections), Gemma 4 Assistant (speculative decoding via Multi-Token Prediction), IBM's Granite Speech Plus (speech-to-text with speaker annotation and word timestamps) and Granite Vision 4.1 (enterprise document extraction), LG's EXAONE 4.5 (33B-parameter open-weight VLM with 256K-token context and MTP), and PP-FormulaNet (table structure recognition). As a breaking change, the release also removes the Apex integration, advising migration to PyTorch-native equivalents.
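To make the MoE terminology concrete, here is a minimal, stdlib-only sketch of top-k expert routing, the core mechanism behind mixture-of-experts layers like DeepSeek's. The function name, expert count, and gate scores are illustrative assumptions, not part of the actual release or its API.

```python
# Toy sketch of top-k expert routing in a mixture-of-experts (MoE) layer.
# All names and numbers are illustrative, not from Transformers v5.8.0.

def top_k_route(gate_scores, k=2):
    """Pick the k highest-scoring experts and renormalize their weights.

    gate_scores: per-expert router scores for one token.
    Returns {expert_index: normalized_weight} for the selected experts.
    """
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    total = sum(gate_scores[i] for i in chosen)
    return {i: gate_scores[i] / total for i in chosen}

# One token's router scores over 4 experts; only the top 2 are activated:
weights = top_k_route([0.1, 0.5, 0.3, 0.1], k=2)
```

In a real MoE layer the token is then processed only by the selected experts and their outputs are combined with these normalized weights, which is what keeps per-token compute far below the model's total parameter count.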

Key Points
  • DeepSeek-V4 introduces hybrid local+long-range attention and Manifold-Constrained Hyper-Connections, with 256K context support.
  • Gemma 4 Assistant enables speculative decoding via Multi-Token Prediction, reusing KV cache to skip pre-fill for faster generation.
  • IBM's Granite Vision 4.1 offers enterprise document extraction (charts, tables, KVP) using a SigLIP2 vision encoder and 8 injection points.
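The draft-and-verify idea behind speculative decoding can be sketched in a few lines. This is a purely illustrative, stdlib-only simulation of the accept/reject step, not the Transformers API or Gemma 4's actual MTP head: a draft proposal of k tokens is compared against what the target model would emit, and the longest agreeing prefix is accepted in a single verification pass.

```python
# Toy simulation of the verification step in speculative decoding.
# A cheap draft proposes several tokens at once; the target model checks
# them in one pass and accepts the longest matching prefix. Illustrative
# only -- not the actual Transformers or Gemma 4 MTP implementation.

def accept_prefix(draft_tokens, target_tokens):
    """Return the longest prefix of the draft that the target agrees with."""
    accepted = []
    for d, t in zip(draft_tokens, target_tokens):
        if d != t:
            break
        accepted.append(d)
    return accepted

# Draft proposes 4 tokens; the target agrees on the first 3, so three
# tokens are committed for the cost of one target forward pass:
accepted = accept_prefix([5, 9, 2, 7], [5, 9, 2, 4])
```

The speedup comes from amortization: when the draft is usually right, several tokens are committed per target-model pass, and reusing the KV cache (as the release notes describe) avoids re-running pre-fill on each verification.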

Why It Matters

Hugging Face consolidates frontier MoE, speculative decoding, and multimodal models in one release, accelerating research and production deployment.