Research & Papers

MIRA benchmark uses LLMs to evaluate cross-category search across 4 data types

arXiv cs.IR May 13, 2026

⚡ A new benchmark tests search across publications, research data, variables, and tools, with topics and relevance judgments generated by LLMs.

Deep Dive

MIRA (Multi-category Integrated Retrieval Assessment) is a new benchmark accepted at SIGIR 2026 that addresses the growing need for search systems capable of returning results from diverse data sources. Built upon a large-scale social science search platform, MIRA covers four distinct categories—Publications, Research Data, Variables, and Instruments & Tools—within a single evaluation framework. The benchmark uses real user queries rather than synthetic ones, making it more representative of actual search behavior.

A key innovation is the use of LLMs to generate topic descriptions, narratives, and relevance assessments, which reduces the manual effort typically required to build test collections. MIRA provides the research community with a foundational testbed for studying multi-faceted, category-aware, and integrated information retrieval. The paper runs 8 pages with 2 figures and has a DOI linked to the ACM proceedings.
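To make "category-aware" evaluation of an integrated result list more concrete, here is a minimal Python sketch. It is not MIRA's data format or official metric: the document ids, category labels, graded judgments, and the choice of nDCG are all assumptions made for illustration. It scores one mixed ranking as a whole and then once per category.

```python
import math

# Hypothetical category labels based on the four categories named in the article.
CATEGORIES = ("publication", "research_data", "variable", "instrument_tool")


def dcg(gains):
    """Discounted cumulative gain of a list of graded relevance values."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg_at_k(ranked_ids, qrels, k=10):
    """nDCG@k for one query; qrels maps doc id -> graded relevance (0 = not relevant)."""
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_ids[:k]]
    ideal = sorted(qrels.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def per_category_ndcg(ranked_ids, qrels, doc_category, k=10):
    """Score the integrated ranking per category, keeping only that category's docs."""
    scores = {}
    for cat in CATEGORIES:
        cat_ranking = [d for d in ranked_ids if doc_category.get(d) == cat]
        cat_qrels = {d: r for d, r in qrels.items() if doc_category.get(d) == cat}
        scores[cat] = ndcg_at_k(cat_ranking, cat_qrels, k)
    return scores


# Made-up example data for a single query (assumed, not taken from MIRA).
doc_category = {
    "p1": "publication",
    "p2": "publication",
    "d1": "research_data",
    "v1": "variable",
    "t1": "instrument_tool",
}
qrels = {"p1": 2, "p2": 0, "d1": 1, "v1": 2, "t1": 0}
ranking = ["p1", "d1", "p2", "v1", "t1"]  # one integrated result list

overall = ndcg_at_k(ranking, qrels)
per_cat = per_category_ndcg(ranking, qrels, doc_category)
print(f"overall nDCG@10: {overall:.3f}")
for cat, score in per_cat.items():
    print(f"{cat}: {score:.3f}")
```

Reporting both views shows whether a system that looks strong overall is neglecting one category. In this sketch a category with no relevant documents scores 0.0, although a real evaluation might exclude such queries from that category's average instead.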

Key Points

Covers 4 heterogeneous categories: Publications, Research Data, Variables, Instruments & Tools
Built on real user queries from a social science search platform, not synthetic data
Uses LLMs to auto-generate topic descriptions and relevance assessments, reducing collection cost

Why It Matters

Enables realistic evaluation of unified search across scholarly data types, pushing IR systems toward practical multi-source retrieval.
