Research & Papers

RMIT-ADM+S at the MMU-RAG NeurIPS 2025 Competition

The award-winning system runs complex research tasks on a single consumer GPU using smaller, efficient LLMs.

Deep Dive

A team of researchers from RMIT University's ADM+S centre has secured a top award at NeurIPS 2025 with a novel approach to retrieval-augmented generation (RAG). Their system, dubbed Routing-to-RAG (R2RAG), won the Best Dynamic Evaluation award in the Open Source category of the MMU-RAG Competition. This victory builds on the team's prior success with the G-RAG system, which won the ACM SIGIR 2025 LiveRAG Challenge, demonstrating a consistent research trajectory focused on making RAG systems more intelligent and adaptive. The core innovation addresses a key bottleneck in RAG: conventional pipelines apply the same intensive retrieval process to every query, regardless of its complexity.

The R2RAG architecture introduces a 'routing' mechanism that dynamically selects a retrieval strategy by inferring the complexity of a user's query and the sufficiency of available evidence. This allows the system to use computational resources more judiciously, avoiding over-processing simple requests. A major practical breakthrough is its ability to support complex research tasks while running on a single consumer-grade GPU, achieved by strategically employing smaller, more efficient large language models (LLMs). The design was further refined through qualitative review of outputs, adding a human-in-the-loop element to system optimization. This combination of smart routing, hardware efficiency, and iterative refinement presents a compelling blueprint for the next generation of RAG systems, prioritizing performance-per-watt and accessibility alongside raw accuracy.
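To make the routing idea concrete, here is a minimal sketch of a query router that picks a retrieval strategy from an estimate of query complexity and a check on evidence sufficiency. The strategy labels, the heuristic complexity score, and the `evidence_sufficient` callback are all illustrative assumptions; the actual R2RAG routing policy is not described in detail here.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical strategy labels -- illustrative, not R2RAG's actual taxonomy.
NO_RETRIEVAL = "no_retrieval"   # answer directly from the LLM's parametric memory
SINGLE_PASS = "single_pass"     # one round of retrieval, then generate
ITERATIVE = "iterative"         # multi-hop: retrieve, read, retrieve again

@dataclass
class RoutedQuery:
    query: str
    strategy: str

def estimate_complexity(query: str) -> int:
    """Crude proxy for query complexity: count multi-hop and comparison cues."""
    cues = {"compare", "versus", "vs", "why", "how", "relationship", "and"}
    words = (w.strip("?,.") for w in query.lower().split())
    return sum(w in cues for w in words) + query.count(",")

def route(query: str, evidence_sufficient: Callable[[str], bool]) -> RoutedQuery:
    """Select a retrieval strategy so simple queries skip heavy retrieval."""
    if evidence_sufficient(query):
        return RoutedQuery(query, NO_RETRIEVAL)
    if estimate_complexity(query) <= 1:
        return RoutedQuery(query, SINGLE_PASS)
    return RoutedQuery(query, ITERATIVE)
```

In a real system the complexity estimate would likely come from a small classifier or the LLM itself rather than keyword cues, but the control flow is the point: cheap queries never trigger the expensive multi-hop path, which is what lets compute budgets stay within a single consumer GPU.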

Key Points
  • Won 'Best Dynamic Evaluation' in the Text-to-Text track of the NeurIPS 2025 MMU-RAG Competition.
  • Introduces 'Routing-to-RAG' (R2RAG), which dynamically adapts retrieval strategy based on query complexity.
  • Designed to run complex research tasks on a single consumer GPU using smaller, efficient LLMs.

Why It Matters

It demonstrates a path to powerful, research-grade AI systems that are dramatically more accessible and cost-effective to run.