Open Source

Ettin Reranker Family: six SOTA cross-encoders from 17M to 1B

New Sentence Transformers rerankers, trained by distillation, reach state-of-the-art results for their size across all six models.

Deep Dive

Tom Aarsen has released the Ettin Reranker Family: six Sentence Transformers CrossEncoder rerankers ranging from 17M to 1B parameters. Built on the Ettin ModernBERT encoders, the models achieve state-of-the-art performance at their respective sizes. They were trained by distillation, using a pointwise MSE loss against scores from mixedbread-ai/mxbai-rerank-large-v2 over a curated dataset that combines pre-training and fine-tuning data. The 17M variant is the cheapest to run and the 1B model delivers the highest accuracy, so the family covers deployments from resource-constrained to high-end. All models are available on Hugging Face under the cross-encoder namespace.
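
To make the idea of pointwise MSE distillation concrete, here is a minimal sketch in plain Python. It is not the release's training code: the student is a toy linear model over one made-up scalar feature per (query, document) pair, and the feature values, teacher scores, learning rate, and function names are all hypothetical. In the real recipe the student is a cross-encoder and the targets are the teacher reranker's scores.

```python
from typing import List, Sequence


def pointwise_mse(student_scores: Sequence[float], teacher_scores: Sequence[float]) -> float:
    """Mean squared error between student and teacher scores, one score per pair."""
    if len(student_scores) != len(teacher_scores) or not student_scores:
        raise ValueError("need two non-empty score lists of equal length")
    squared_errors = [(s - t) ** 2 for s, t in zip(student_scores, teacher_scores)]
    return sum(squared_errors) / len(squared_errors)


def student_predict(w: float, b: float, features: Sequence[float]) -> List[float]:
    """Toy stand-in for a cross-encoder: a line over one made-up feature per pair."""
    return [w * x + b for x in features]


# Hypothetical data: one scalar feature per (query, document) pair and the
# score a teacher reranker assigned to that pair.
features = [0.0, 1.0, 2.0, 3.0]
teacher_scores = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0
learning_rate = 0.05  # assumed value, small enough for this toy problem to converge
initial_loss = pointwise_mse(student_predict(w, b, features), teacher_scores)

for _ in range(2000):
    preds = student_predict(w, b, features)
    n = len(features)
    # Gradients of the MSE with respect to the student's two parameters.
    grad_w = sum(2 * (p - t) * x for p, t, x in zip(preds, teacher_scores, features)) / n
    grad_b = sum(2 * (p - t) for p, t in zip(preds, teacher_scores)) / n
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

final_loss = pointwise_mse(student_predict(w, b, features), teacher_scores)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.6f}")
```

Each pair contributes its own squared error, independent of the other pairs for the same query. That independence is what "pointwise" refers to, in contrast to pairwise or listwise ranking losses.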

Rerankers (pointwise cross-encoders) are more accurate than embedding models because they encode each query and document together, but they cost more per pair. The standard production pattern is therefore retrieve-then-rerank: a fast embedding model retrieves the top-K candidates, then a cross-encoder re-orders them. The Ettin rerankers work directly with Sentence Transformers: load a model in three lines and call predict() or rank(). When paired with google/embeddinggemma-300m on MTEB (eng, v2) Retrieval, they achieve leading results. The release also includes the full training data and the distillation recipe, plus a new Agent Skill for fine-tuning custom rerankers via AI coding agents (Claude Code, Codex, etc.). Because everything is open source, teams can deploy high-quality reranking without proprietary dependencies.
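
The retrieve-then-rerank pattern can be sketched in a few lines of Python. This is an illustration under stated assumptions, not code from the release: the two scorers are toy word-overlap functions standing in for an embedding model and a cross-encoder, and every name here (retrieve_then_rerank, overlap_score, phrase_aware_score, the corpus) is hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Scorer = Callable[[str, str], float]


def retrieve_then_rerank(
    query: str,
    corpus: Sequence[str],
    cheap_score: Scorer,
    pair_score: Scorer,
    top_k: int,
    top_n: int,
) -> List[Tuple[int, str]]:
    """Keep the top_k documents by the cheap scorer, then re-order them with the costly one."""
    # Stage 1: the cheap scorer sees the whole corpus.
    by_cheap = sorted(
        range(len(corpus)), key=lambda i: cheap_score(query, corpus[i]), reverse=True
    )
    candidates = by_cheap[:top_k]
    # Stage 2: the costly scorer only sees the top_k survivors.
    reranked = sorted(candidates, key=lambda i: pair_score(query, corpus[i]), reverse=True)
    return [(i, corpus[i]) for i in reranked[:top_n]]


def overlap_score(query: str, doc: str) -> float:
    """Toy stand-in for embedding similarity: number of shared words."""
    return float(len(set(query.split()) & set(doc.split())))


def phrase_aware_score(query: str, doc: str) -> float:
    """Toy stand-in for a cross-encoder: word overlap plus a bonus for an exact phrase match."""
    bonus = 1.0 if query in doc else 0.0
    return overlap_score(query, doc) + bonus


# Hypothetical corpus and query.
corpus = [
    "document pairs and query logs",
    "rerankers score query and document pairs jointly",
    "embedding models encode documents independently",
    "a recipe for sourdough bread",
]
query = "query and document pairs"

results = retrieve_then_rerank(
    query, corpus, overlap_score, phrase_aware_score, top_k=2, top_n=1
)
print(results)
```

The first two documents tie under the cheap scorer, so the first stage keeps both in corpus order, and the second stage promotes the document that contains the exact phrase. In a Sentence Transformers pipeline, the second stage would instead call CrossEncoder.predict() on the (query, document) pairs or CrossEncoder.rank() on the candidate list, which is where the per-pair cost mentioned above is paid.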

Key Points
  • Six sizes: 17M, 32M, 68M, 150M, 400M, and 1B parameters, all SOTA at their size.
  • Trained via pointwise MSE distillation using mxbai-rerank-large-v2 scores over a curated dataset.
  • Achieves top results on MTEB (eng, v2) Retrieval when paired with embedding models like google/embeddinggemma-300m.

Why It Matters

Open-source, SOTA rerankers enable cheap, high-accuracy retrieval pipelines for search and RAG systems.