Beyond Augmented-Action Surrogates for Multi-Expert Learning-to-Defer
A new method fixes a flaw in AI systems that decide which expert model should handle a task.
A team of researchers has identified and fixed a flaw in how AI systems decide which specialized model, or 'expert', should handle a given task. This process, known as Learning-to-Defer, traditionally assumes that the information available to each expert is fixed. In real systems, such as those using retrieval-augmented generation (RAG) or tool-calling agents, the system can also choose what additional context (documents, tool outputs) to provide *after* selecting the expert. The researchers formalize this setting as the 'Learning-to-Defer with Advice' problem.
They proved that a common, intuitive approach, which trains one component to route inputs and a separate component to choose the advice, is statistically inconsistent: even when its training objective is fully minimized, it can fail to recover the optimal policy. In response, they developed a new 'augmented surrogate' method that treats the choice of expert and the choice of advice as a single, combined decision. This approach comes with a proven H-consistency guarantee, meaning that minimizing the surrogate loss leads to the theoretically optimal, cost-minimizing policy.
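To make the difference between separate and combined decisions concrete, here is a minimal toy sketch. It is not the researchers' surrogate loss or their proof: the cost table is invented, and the 'separated' rule (each component minimizes its own average cost) is only one hypothetical way of splitting the decision. It shows how two independently reasonable choices can combine into a poor pair.

```python
from itertools import product

# Hypothetical expected costs at one input: cost[expert][advice].
# Each entry is assumed to already include the price of acquiring the advice.
# These numbers are made up for illustration and do not come from the paper.
cost = [
    [0.9, 0.1],  # expert 0: weak on its own, strong when given advice 1
    [0.3, 0.4],  # expert 1: decent with either advice option
]

def joint_choice(cost):
    """Pick the (expert, advice) pair with the lowest cost."""
    pairs = product(range(len(cost)), range(len(cost[0])))
    return min(pairs, key=lambda p: cost[p[0]][p[1]])

def separated_choice(cost):
    """Pick expert and advice independently, each by its own average cost."""
    n_experts, n_advice = len(cost), len(cost[0])
    expert = min(range(n_experts), key=lambda j: sum(cost[j]) / n_advice)
    advice = min(range(n_advice), key=lambda a: sum(row[a] for row in cost) / n_experts)
    return expert, advice

joint = joint_choice(cost)
separated = separated_choice(cost)
print("joint:    ", joint, cost[joint[0]][joint[1]])
print("separated:", separated, cost[separated[0]][separated[1]])
```

In this toy table the combined decision selects expert 0 with advice 1 at a cost of 0.1, while the separated rule settles on expert 1 with advice 1 at a cost of 0.4, because neither component sees how the other's choice changes its own costs.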
Experiments across tabular, language, and multi-modal tasks showed that their method outperforms standard Learning-to-Defer. It also adapts how much 'advice' it gathers, weighing the cost of being wrong against the cost of acquiring more information. A synthetic benchmark confirmed the predicted failure mode of the older, separated approach, supporting the theoretical findings.
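The trade-off between the cost of an error and the cost of more information can be sketched with a small calculation. The error probabilities and advice prices below are hypothetical and are not taken from the paper's experiments; the sketch only assumes that the total cost is the error cost times the error probability, plus the price of the advice.

```python
# Hypothetical numbers for a single expert: more advice lowers the chance of
# an error but costs more to acquire. None of these values come from the paper.
error_prob = [0.30, 0.15, 0.10]    # error probability with 0, 1, 2 retrieved documents
advice_price = [0.00, 0.05, 0.15]  # price of acquiring 0, 1, 2 documents

def best_advice_level(error_cost):
    """Return the advice level minimizing error_cost * P(error) + advice price."""
    totals = [error_cost * p + c for p, c in zip(error_prob, advice_price)]
    return min(range(len(totals)), key=totals.__getitem__)

for error_cost in (0.2, 1.0, 5.0):
    print(error_cost, best_advice_level(error_cost))
```

With these made-up values, a cheap mistake (error cost 0.2) favors no advice, a moderate one (1.0) favors one document, and an expensive one (5.0) favors two, which is the kind of cost-dependent behavior the experiments describe.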
- Fixes a flaw in AI routing where systems can't dynamically choose what info to give a selected expert model (e.g., GPT-4, Claude).
- Proves common 'separated surrogate' methods are inconsistent; new 'augmented surrogate' method guarantees recovery of the Bayes-optimal policy.
- Tested on tabular, language, and multi-modal tasks, showing improved performance over standard Learning-to-Defer and advice acquisition that adapts to cost.
Why It Matters
Enables more efficient and accurate AI systems that better orchestrate multiple models and tools, reducing costs and errors.