Open Source

Unsloth fixes bug in Mistral Medium 3.5 implementations

A YaRN parsing quirk that caused inference errors across multiple implementations has been resolved.

Deep Dive

Unsloth, the open-source fine-tuning library, announced a collaborative fix with Mistral for a bug in the Mistral Medium 3.5 model. The issue manifested as incorrect inference behavior in several popular implementations, including Hugging Face’s transformers and llama.cpp, and was traced to a quirk in how the model’s YaRN (Yet another RoPE extensioN) configuration was parsed. Changing the mscale_all_dim parameter from 1 to 0 resolved the errors, and updated GGUF files incorporating the fix have been released.

The update also corrected improperly generated mmproj files, which are required for certain multimodal use cases. Notably, the bug was unrelated to Unsloth’s own quantization methods or library.

Users deploying Mistral Medium 3.5 (a 1.3B-parameter model optimized for fast inference) should upgrade to the latest GGUF files to avoid incorrect outputs. The collaboration underscores the importance of coordinated debugging between model developers and inference-tooling maintainers.
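To see why a single flag like mscale_all_dim can skew outputs, consider how the parameter enters YaRN's attention-magnitude correction in DeepSeek-style implementations. The sketch below is an illustration under that assumption, not the confirmed formula used by Mistral Medium 3.5 or the exact patch Unsloth shipped: the correction is a ratio of two magnitude terms, and mscale_all_dim controls the denominator.

```python
import math

# Sketch of how mscale_all_dim can enter YaRN attention scaling.
# Modeled on DeepSeek-style YaRN code; the exact formula used by
# Mistral Medium 3.5 is an assumption, not stated in the article.

def yarn_mscale(scale: float, mscale: float) -> float:
    """Magnitude correction term for a context-extension factor `scale`."""
    if scale <= 1.0:
        return 1.0
    return 0.1 * mscale * math.log(scale) + 1.0

def attention_scale(factor: float, mscale: float, mscale_all_dim: float) -> float:
    """Combined multiplier applied to the softmax scale.

    With mscale_all_dim = 0 the denominator collapses to exactly 1.0,
    so only the numerator's correction is applied.
    """
    return yarn_mscale(factor, mscale) / yarn_mscale(factor, mscale_all_dim)

# A parser that reads mscale_all_dim as 1 instead of 0 silently
# changes the multiplier, which shows up as degraded inference quality.
misparsed = attention_scale(4.0, 1.0, 1.0)  # denominator cancels numerator
corrected = attention_scale(4.0, 1.0, 0.0)  # denominator is 1.0
```

In this toy setting the misparsed value works out to 1.0 while the corrected value is roughly 1.14, a uniform rescaling of attention logits that would plausibly produce the "incorrect but not crashing" behavior described above.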

Key Points
  • Unsloth and Mistral identified a YaRN parsing quirk causing inference errors in Mistral Medium 3.5.
  • The fix required changing mscale_all_dim from 1 to 0; updated GGUF files are now available.
  • The bug affected multiple implementations including transformers and llama.cpp, but not Unsloth’s own quants.

Why It Matters

The fix restores reliable inference for users of Mistral Medium 3.5, a key model for efficient local AI deployment.