Research & Papers

Unleashing Low-Bit Inference on Ascend NPUs: A Comprehensive Evaluation of HiFloat Formats

These low-bit formats could make AI inference on Huawei's hardware significantly faster and more memory-efficient.

Deep Dive

A new research paper evaluates HiFloat, a family of low-bit floating-point formats (HiF8 and HiF4) tailored for Huawei's Ascend NPUs. The key finding is that in the critical 4-bit regime, HiF4's hierarchical scaling prevents the catastrophic accuracy collapse seen with traditional integer formats. The formats are fully compatible with modern post-training quantization frameworks, offering a practical path to high-efficiency large language model inference on specialized AI accelerators.
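The intuition behind that 4-bit result can be sketched numerically. The toy below contrasts plain INT4 quantization (one scale for the whole tensor) with a blockwise scheme, used here as a simplified stand-in for HiF4's hierarchical scaling; the exact HiF4 bit layout is not described in this summary, so the function names and block size are illustrative assumptions, not the paper's spec:

```python
import numpy as np

def quant_int4_flat(x):
    # Plain INT4: one scale for the whole tensor, so a few outliers
    # dictate the step size and crush all the small weights to zero.
    s = np.max(np.abs(x)) / 7.0
    return np.clip(np.round(x / s), -8, 7) * s

def quant_int4_blockwise(x, block=16):
    # Simplified stand-in for hierarchical scaling: each small block
    # gets its own scale, so outliers only degrade their own block.
    out = np.empty_like(x)
    for i in range(0, x.size, block):
        blk = x[i:i + block]
        m = np.max(np.abs(blk))
        s = m / 7.0 if m > 0 else 1.0
        out[i:i + block] = np.clip(np.round(blk / s), -8, 7) * s
    return out

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, 1024)
w[::128] = 1.0  # inject outliers, as commonly seen in LLM weights
err_flat = np.mean((w - quant_int4_flat(w)) ** 2)
err_block = np.mean((w - quant_int4_blockwise(w)) ** 2)
print(err_flat, err_block)  # blockwise error is far smaller
```

With a single scale, the outliers force a step size so coarse that most normal-magnitude weights round to zero; finer-grained scaling is exactly the kind of mechanism that averts the 4-bit accuracy collapse described above.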

Why It Matters

These formats enable more powerful and efficient AI models to run on Huawei's ecosystem, challenging NVIDIA's dominance in AI accelerators.