Don't Retrieve, Navigate: Distilling Enterprise Knowledge into Navigable Agent Skills for QA and RAG
New method distills documents into a hierarchical directory, letting AI agents browse like humans to find answers.
A team of researchers has introduced Corpus2Skill, a framework that rethinks how AI agents interact with enterprise knowledge bases. In the standard Retrieval-Augmented Generation (RAG) approach, a model passively receives search results. Corpus2Skill instead distills the entire document corpus offline: it iteratively clusters documents, generates LLM-written summaries at each level of the resulting hierarchy, and materializes the result as a navigable tree of 'skill' files.
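The offline compilation step can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_summarize` stands in for the LLM summarizer, `toy_cluster` stands in for whatever clustering the framework uses, and `Node`, `build_tree`, and `leaf_documents` are hypothetical names chosen for illustration. Writing the tree out as skill files is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    summary: str
    children: List["Node"] = field(default_factory=list)
    document: str = ""  # filled in only for leaf nodes


def toy_summarize(texts: List[str]) -> str:
    # Assumption: placeholder for an LLM call. Keeps the first
    # three words of each input text and joins them.
    return " | ".join(" ".join(t.split()[:3]) for t in texts)


def toy_cluster(nodes: List[Node], size: int) -> List[List[Node]]:
    # Assumption: placeholder for real clustering. Groups
    # neighbouring nodes in list order, `size` nodes per group.
    return [nodes[i:i + size] for i in range(0, len(nodes), size)]


def build_tree(
    docs: List[str],
    summarize: Callable[[List[str]], str] = toy_summarize,
    branching: int = 2,
) -> Node:
    if not docs:
        raise ValueError("docs must not be empty")
    if branching < 2:
        raise ValueError("branching must be at least 2")
    # Level 0: one leaf per document, each with its own summary.
    level = [Node(summary=summarize([d]), document=d) for d in docs]
    # Cluster and summarize repeatedly until a single root remains.
    while len(level) > 1:
        groups = toy_cluster(level, branching)
        level = [
            Node(summary=summarize([n.summary for n in g]), children=g)
            for g in groups
        ]
    return level[0]


def leaf_documents(node: Node) -> List[str]:
    # Collect the original documents in left-to-right order.
    if not node.children:
        return [node.document]
    return [d for child in node.children for d in leaf_documents(child)]
```

Each pass shrinks the number of nodes, so the loop ends with one root whose summary describes the whole corpus, while every original document remains reachable as a leaf.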
At serve time, the agent receives a bird's-eye view of how the entire knowledge corpus is organized. It can then actively navigate the explicitly visible hierarchy: drilling into specific topic branches via progressively finer summaries, backtracking from unproductive paths, and combining evidence scattered across different branches. This mirrors how a person browses a well-organized library or website sitemap.
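The drill-down and backtracking behavior can be sketched as a depth-first walk over a summary tree. This is a simplified illustration under stated assumptions, not the Corpus2Skill agent: word overlap (`overlap`) replaces the LLM's judgment about which branch looks promising, `is_answer` is a hypothetical check for whether a document resolves the query, and the tiny `tree` is invented toy data.

```python
from typing import Callable, List, Optional, Tuple


def overlap(query: str, text: str) -> int:
    # Assumption: crude relevance score, the number of shared words.
    return len(set(query.lower().split()) & set(text.lower().split()))


def navigate(
    node: dict,
    query: str,
    is_answer: Callable[[str], bool],
    visited: List[str],
    path: Tuple[str, ...] = (),
) -> Optional[List[str]]:
    visited.append(node["summary"])
    path = path + (node["summary"],)
    children = node.get("children", [])
    if not children:
        # Leaf: accept it only if the document resolves the query.
        return list(path) if is_answer(node["document"]) else None
    # Try the most promising branch first, judged from summaries alone.
    ranked = sorted(
        children, key=lambda c: overlap(query, c["summary"]), reverse=True
    )
    for child in ranked:
        found = navigate(child, query, is_answer, visited, path)
        if found is not None:
            return found
    return None  # nothing useful here, so the caller backtracks


# Invented toy hierarchy for illustration only.
tree = {
    "summary": "help center",
    "children": [
        {"summary": "billing and domain payments", "children": [
            {"summary": "refund policy",
             "document": "Refunds are issued within 14 days."},
        ]},
        {"summary": "domain setup", "children": [
            {"summary": "connect a domain",
             "document": "Point your DNS records to connect a domain."},
        ]},
    ],
}

visited: List[str] = []
result = navigate(tree, "domain payments dns", lambda d: "DNS" in d, visited)
```

In this toy run the billing branch scores highest and is explored first, yields nothing, and the walk backtracks to the domain branch, so `result` is the path `help center`, `domain setup`, `connect a domain`. Combining evidence from several branches would need the walk to keep collecting leaves instead of stopping at the first match.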
The research reports significant performance gains. On WixQA, an enterprise customer-support benchmark designed for RAG systems, Corpus2Skill outperformed established baselines: standard dense retrieval (the kind typically backed by a vector database), the RAPTOR method (which also uses recursive summaries), and other agentic RAG approaches. The key advantage is that the LLM agent gains spatial and relational awareness of the knowledge base, which moves it from a passive consumer of search results to an active, reasoning explorer.
- Corpus2Skill compiles documents offline into a hierarchical 'skill directory' of LLM-written summaries, creating a navigable map of knowledge.
- At query time, AI agents can see and reason over this hierarchy, allowing for strategic navigation, backtracking, and evidence combination.
- The method outperformed dense retrieval, RAPTOR, and agentic RAG baselines on the WixQA enterprise support benchmark across all quality metrics.
Why It Matters
This could make enterprise AI assistants and customer support bots significantly more accurate and reliable by enabling human-like browsing of complex knowledge bases.