Research & Papers

[P] ColQwen3.5-v1 4.5B SOTA on ViDoRe V1 (nDCG@5 0.917)

A new 4.5B parameter model achieves state-of-the-art performance on a key document retrieval benchmark.

Deep Dive

Independent developer madkimchi has released ColQwen3.5-v1, an open-source model that achieves state-of-the-art results on a key visual document retrieval benchmark. The 4.5 billion parameter model is built on the Qwen3.5-4B foundation and implements the ColPali late-interaction architecture, which embeds queries and document pages as sets of vectors and scores a page by matching each query token against its most similar page vector. It currently ranks #1 on the ViDoRe V1 benchmark with a normalized discounted cumulative gain at rank 5 (nDCG@5) of 0.917, a metric on a 0-to-1 scale that rewards placing relevant documents near the top of the first five results, and it shows competitive results on the newer ViDoRe V3 benchmark.
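To make the late-interaction idea concrete, here is a minimal sketch of ColBERT-style MaxSim scoring, the scoring rule that ColPali-family models use. It is an illustration only, not the model's actual code: the function names and the tiny 2-dimensional vectors are hypothetical, and a real model would produce hundreds of higher-dimensional vectors per page.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for each query token vector, keep its best
    match among the document's vectors, then sum over query tokens."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rank_documents(query_vecs, docs):
    """Return (doc_name, score) pairs sorted from best to worst."""
    scored = [(name, maxsim_score(query_vecs, vecs)) for name, vecs in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: 2 query token vectors, 2 candidate pages.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "page_a": [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]],
    "page_b": [[0.7, 0.7], [0.2, 0.1]],
}
ranking = rank_documents(query, docs)
print(ranking)
```

In this toy case page_a scores higher (about 1.8 versus 1.4) because each query vector finds a close match among its vectors, while page_b only offers one vector that partially matches both.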

The model was trained in four phases, including hard negative mining, which improves its ability to discriminate between similar documents, and domain specialization on financial and tabular data. The weights are released under the permissive Apache 2.0 license on Hugging Face, and a pull request has been submitted to integrate the model into the main ColPali repository on GitHub. The developer is already working on a v2 that aims to simplify the training process, expand domain coverage, and take the top spot on the ViDoRe V3 benchmark as well.
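Hard negative mining, in its generic form, can be sketched as follows: score candidate documents with the current model, then pick the highest-scoring documents that are not actually relevant and use them as negatives in the next training round. This is a common pattern and not necessarily the developer's procedure; the function name, document IDs, and scores below are all hypothetical.

```python
def mine_hard_negatives(scores, positives, num_negatives=2):
    """Pick the highest-scoring non-relevant documents for one query.

    scores: dict mapping doc_id -> similarity score from the current model.
    positives: set of doc_ids known to be relevant to the query.
    """
    candidates = [(doc_id, s) for doc_id, s in scores.items() if doc_id not in positives]
    # The most confusable negatives are those the model currently scores highest.
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in candidates[:num_negatives]]


# Hypothetical scores for one query over four financial documents.
scores = {
    "10k_2023": 0.92,
    "10k_2022": 0.88,
    "10q_2023": 0.81,
    "press_release": 0.35,
}
hard = mine_hard_negatives(scores, positives={"10k_2023"})
print(hard)
```

Here the two near-duplicate filings are selected and the easily distinguished press release is skipped, which is the point of the technique: training on negatives the model already separates well teaches it little.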

Key Points
  • Achieves #1 rank on ViDoRe V1 benchmark with a 0.917 nDCG@5 score.
  • A 4.5B parameter model built on Qwen3.5-4B using the ColPali late-interaction architecture.
  • Open-source (Apache 2.0) and available on Hugging Face, with a v2 already in development.
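
For readers unfamiliar with the nDCG@5 metric cited above, the sketch below computes it for a single query. It assumes linear gain and that the input list covers every judged document for the query, in the order the model ranked them; a benchmark score such as 0.917 would be an average of this value over many queries. The function names and example labels are illustrative.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k results (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=5):
    """DCG of the model's ranking divided by the DCG of the ideal ranking."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents for this query
    return dcg_at_k(relevances, k) / ideal_dcg


# Hypothetical relevance labels in ranked order (1 = relevant, 0 = not).
perfect = ndcg_at_k([1, 0, 0, 0, 0])   # relevant page ranked first
second = ndcg_at_k([0, 1, 0, 0, 0])    # relevant page ranked second
print(perfect, second)
```

A relevant page in first position gives a score of 1.0, while the same page in second position is discounted by a factor of 1/log2(3), roughly 0.63.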

Why It Matters

ColQwen3.5-v1 gives developers an openly licensed (Apache 2.0) retrieval model that tops the ViDoRe V1 leaderboard and was specialized during training for financial and tabular documents.