Attention calibration reduces positional bias in dense retrieval

arXiv cs.IR June 03, 2026

No retraining needed: a simple inference-time tweak makes dense retrievers less sensitive to where relevant content sits in a passage.

Deep Dive

Dense retrieval models often suffer from positional bias: relevant information that appears later in a passage gets less attention, which degrades retrieval effectiveness. Andrianos Michail and colleagues adapt inference-time attention calibration (previously introduced by Schuhmacher et al., 2026) to downstream retrieval, adding a strength coefficient lambda that interpolates between the original and the fully calibrated attention distributions. This gives fine-grained control over how strongly the bias is corrected, without retraining.
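To make the role of lambda concrete, here is a minimal sketch that assumes the interpolation is a plain linear blend of two attention rows. The summary does not give the paper's exact formula or describe how the calibrated distribution is produced, so the function name `interpolate_attention` and both example rows are hypothetical.

```python
import numpy as np

def interpolate_attention(original, calibrated, lam):
    """Blend two attention distributions.

    lam=0 keeps the original attention, lam=1 uses the fully calibrated one.
    Assumption: a linear (convex) blend; the paper's exact rule may differ.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    original = np.asarray(original, dtype=float)
    calibrated = np.asarray(calibrated, dtype=float)
    mixed = (1.0 - lam) * original + lam * calibrated
    # A convex combination of two distributions already sums to 1;
    # renormalising only guards against floating-point drift.
    return mixed / mixed.sum(axis=-1, keepdims=True)

# Hypothetical attention row over four token positions, skewed toward early tokens.
original = np.array([0.55, 0.25, 0.15, 0.05])
# Hypothetical stand-in for a calibrated row (here simply uniform).
calibrated = np.full(4, 0.25)

half = interpolate_attention(original, calibrated, lam=0.5)
print(half)
```

With lambda=0.5 the last position's weight moves from 0.05 to 0.15, halfway toward the calibrated value, which is the sense in which partial calibration softens the bias rather than overriding the model's attention.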

Testing on SQuAD-PosQ and FineWeb-PosQ across three embedding models, the team found that partial calibration (B=128, lambda=0.5, 50% layer depth) outperforms full calibration, improving the harmonic mean of nDCG@10 across positional groups for all three models without per-model tuning. The method transfers to PosIR, which spans 10 languages and 31 domains, reducing the Position Sensitivity Index in all 16 length-quartile × model × retrieval-setting combinations while preserving or improving aggregate nDCG@10. The code is released on GitHub, offering a practical, drop-in fix for fairer retrieval.
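The harmonic mean is a natural summary here because it penalises a model that does well on some positional groups and poorly on others. The sketch below illustrates that property only; the group scores are made-up numbers, not results from the paper, and `harmonic_mean` is a hypothetical helper.

```python
def harmonic_mean(scores):
    """Harmonic mean of per-group scores such as nDCG@10 values in [0, 1]."""
    scores = list(scores)
    if not scores:
        raise ValueError("need at least one score")
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative")
    if any(s == 0 for s in scores):
        # The harmonic mean tends to 0 as any single score tends to 0.
        return 0.0
    return len(scores) / sum(1.0 / s for s in scores)

# Made-up nDCG@10 values for three positional groups (early, middle, late).
balanced = harmonic_mean([0.6, 0.6, 0.6])
skewed = harmonic_mean([0.9, 0.6, 0.3])

print(f"balanced: {balanced:.3f}, skewed: {skewed:.3f}")
```

Both score lists have an arithmetic mean of 0.6, but the skewed one has a noticeably lower harmonic mean, so a gain on this metric suggests the weaker positional groups improved rather than only the average.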

Key Points

Inference-time attention calibration with lambda interpolation (no retraining needed) reduces positional bias in dense retrieval.
Best-performing configuration: B=128, lambda=0.5, 50% layer depth. It improves nDCG@10 across all positional groups on FineWeb-PosQ for three models.
Transfers to 10-language, 31-domain PosIR benchmark, reducing Position Sensitivity Index in all 16 tested combinations while maintaining retrieval quality.

Why It Matters

The method makes search results less dependent on where relevant content appears in a passage, without costly retraining, and the effect holds across languages and domains.
