Open Source

IBM's Granite Embedding R2 delivers best sub-100M multilingual retrieval with 32K context

Two new Apache 2.0 models, with the 97M variant beating every open sub-100M multilingual embedder on MTEB Multilingual Retrieval.

Deep Dive

IBM's Granite Embedding Multilingual R2 release tackles the trade-off between language coverage and model size. The new family includes two Apache 2.0 models: a compact 97M-parameter model with 384-dimensional embeddings and a full-size 311M-parameter model with 768-dimensional embeddings plus Matryoshka support. Both are built on ModernBERT and handle context lengths up to 32,768 tokens – a 64x increase over their R1 predecessors. They cover 200+ languages, with explicit retrieval training for 52 languages and code retrieval across Python, Go, Java, and six other languages. The models ship with ONNX and OpenVINO weights for CPU-optimized inference.
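Matryoshka support means the leading dimensions of an embedding are trained to remain useful on their own, so a vector can be cut short and re-normalized to save index space. The following is a minimal sketch of that truncation step, not IBM's implementation: the function names are hypothetical, the 4-dimensional vector stands in for a real 768-dimensional one, and the article does not say which truncated sizes the 311M model was trained for.

```python
import math
from typing import List, Sequence


def truncate_embedding(vec: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` values of a Matryoshka-style embedding and L2-normalize."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero prefix cannot be normalized
    return [x / norm for x in head]


def index_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw storage for float32 vectors, ignoring any index overhead."""
    return num_vectors * dim * bytes_per_value


# Toy 4-dim vector standing in for a full-size embedding (illustrative values).
full = [3.0, 4.0, 0.0, 12.0]
short = truncate_embedding(full, 2)

# Halving the dimension halves raw storage for one million vectors.
full_size = index_bytes(1_000_000, 768)
half_size = index_bytes(1_000_000, 384)
```

Re-normalizing after truncation matters when the index compares vectors by dot product, because the shortened prefix no longer has unit length.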

On the MTEB Multilingual Retrieval benchmark, the 97M model scores 60.3, the best result for any open sub-100M multilingual embedding model. The 311M model scores 65.2, ranking second among open models under 500M parameters. Both work as drop-in replacements in LangChain, LlamaIndex, Haystack, and Milvus: switching requires changing only the model name. IBM deliberately avoided MS-MARCO training data, instead curating a mix of IBM-owned, public, and synthetic datasets with strict quality and governance filters, an approach intended to reduce commercial licensing risk for enterprise deployments.
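A drop-in swap is possible when retrieval code depends only on a function that turns text into vectors. The sketch below illustrates that design with a toy letter-count embedder so it runs without downloading a model; `rank`, `toy_letter_embedder`, and the `Embedder` type are hypothetical names, and none of this is the API of the frameworks named above.

```python
import math
from typing import Callable, List, Sequence, Tuple

# An embedder maps a batch of texts to one vector per text.
Embedder = Callable[[Sequence[str]], List[List[float]]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rank(query: str, docs: Sequence[str], embed: Embedder) -> List[Tuple[float, str]]:
    """Score every document against the query and sort best-first."""
    q_vec = embed([query])[0]
    d_vecs = embed(docs)
    scored = [(cosine(q_vec, v), d) for v, d in zip(d_vecs, docs)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def toy_letter_embedder(texts: Sequence[str]) -> List[List[float]]:
    """Stand-in embedder: 26-dim letter counts. A real model would replace this."""
    vecs = []
    for text in texts:
        v = [0.0] * 26
        for ch in text.lower():
            if "a" <= ch <= "z":
                v[ord(ch) - ord("a")] += 1.0
        vecs.append(v)
    return vecs


results = rank("cat", ["dog", "cat food", "tac"], toy_letter_embedder)
```

In a real pipeline the `embed` argument would wrap whichever model the framework loads by name, so changing that name is the only edit `rank` ever sees.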

Key Points
  • 97M model scores 60.3 on MTEB Multilingual Retrieval – best among all open sub-100M models.
  • 311M model scores 65.2, ranking #2 among open models under 500M parameters, with Matryoshka dimension support.
  • Both models support 200+ languages, 32K-token context (64x that of their R1 predecessors), and code retrieval for 9 programming languages.

Why It Matters

Enterprise teams can now deploy a fast, open multilingual embedding model for RAG without giving up language coverage or a commercially permissive license.