FlexStructRAG: Flexible Structure-Aware Multi-Granular Relational Retrieval for RAG
New RAG framework uses knowledge graphs, hypergraphs, and clusters to retrieve evidence at four different levels.
Researchers have introduced FlexStructRAG, a framework that targets a core limitation of current Retrieval-Augmented Generation (RAG) systems. Most existing RAG approaches either retrieve fixed-length text chunks, which can fragment important context, or commit to a single structured index such as a knowledge graph, which locks them into one relational granularity. Both choices make them brittle when queries call for different forms of evidence: simple binary facts, complex multi-entity interactions, or broader document-level context.
FlexStructRAG addresses this by jointly constructing three heterogeneous knowledge representations: a knowledge graph for binary relations, a knowledge hypergraph for n-ary (multi-entity) relations, and structure-aware semantic clusters that aggregate evidence into document-grounded units. To combat the semantic fragmentation of uniform text chunking, it uses dynamic partitioning and a truncated sliding-window mechanism to preserve contextual dependencies during knowledge construction.
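To make the three representations concrete, here is a minimal Python sketch of how they might be modeled, together with one possible reading of a truncated sliding window. The class names (`Edge`, `Hyperedge`, `Cluster`), the function `sliding_windows`, and its `size` and `overlap` parameters are all hypothetical and are not taken from the paper, whose actual partitioning and extraction steps are not described here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    """Binary relation between two entities (a knowledge-graph edge)."""
    head: str
    relation: str
    tail: str


@dataclass(frozen=True)
class Hyperedge:
    """N-ary relation connecting any number of entities."""
    relation: str
    entities: frozenset


@dataclass
class Cluster:
    """Document-grounded unit grouping related edges and hyperedges."""
    doc_id: str
    edges: list = field(default_factory=list)
    hyperedges: list = field(default_factory=list)


def sliding_windows(sentences, size=3, overlap=1):
    """Yield overlapping windows of sentences.

    Assumption: "truncated" is read here as letting the final window
    be shorter than `size` instead of padding it.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    for start in range(0, len(sentences), step):
        yield sentences[start:start + size]
        if start + size >= len(sentences):
            break  # the text is fully covered; stop here
```

Because consecutive windows share `overlap` sentences, a relation that spans a window boundary still appears whole in at least one window, which is the kind of contextual dependency that uniform chunking tends to cut.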
At inference time, the framework supports query-adaptive, multi-granular retrieval. It can pull evidence at four distinct levels (entities, graph edges, hyperedges, and entire clusters) and flexibly combine them to supply the language model with relationally and contextually aligned information. The system can therefore match the evidence type to what each query needs, rather than applying a single retrieval granularity to every query.
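The idea of retrieving across four levels and combining the results can be illustrated with a toy Python sketch. Everything in it is an assumption made for illustration: the `Evidence` tuple, the `retrieve` function, its `levels` and `k` parameters, and especially the scoring, which uses plain word overlap as a stand-in. The framework's actual scoring and combination strategy are not described here.

```python
import re
from typing import NamedTuple


class Evidence(NamedTuple):
    level: str  # "entity", "edge", "hyperedge", or "cluster"
    text: str


def tokens(text):
    """Lowercased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(query, text):
    """Toy relevance score: Jaccard overlap of the two word sets."""
    q, t = tokens(query), tokens(text)
    return len(q & t) / len(q | t) if q and t else 0.0


def retrieve(query, index, levels=("entity", "edge", "hyperedge", "cluster"), k=3):
    """Rank evidence from the selected levels together; return the top k.

    Items with zero overlap are dropped, so fewer than k may be returned.
    """
    pool = [ev for ev in index if ev.level in levels]
    ranked = sorted(pool, key=lambda ev: score(query, ev.text), reverse=True)
    return [ev for ev in ranked[:k] if score(query, ev.text) > 0]


# Hypothetical index holding one item per granularity level.
index = [
    Evidence("entity", "aspirin"),
    Evidence("edge", "aspirin inhibits COX-1"),
    Evidence("hyperedge", "aspirin and warfarin together raise bleeding risk in elderly patients"),
    Evidence("cluster", "Clinical guidance on anticoagulant dosing and monitoring"),
]

for ev in retrieve("aspirin", index, k=2):
    print(ev.level, "|", ev.text)
```

Passing a narrower `levels` tuple, for example `("hyperedge", "cluster")`, restricts retrieval to coarser evidence, which is one simple way a caller could adapt the granularity to the query.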
The paper, submitted to arXiv, reports that in experiments on the UltraDomain benchmark across four domains, FlexStructRAG improves semantic evaluation metrics over strong existing RAG baselines. According to the authors, ablation studies confirm that both the multi-granular retrieval and the structure-aware clustering are needed, which supports the core design principles behind the framework.
- Constructs three knowledge representations: a graph for binary relations, a hypergraph for n-ary relations, and semantic clusters for document context.
- Enables four-level retrieval (entity, edge, hyperedge, cluster) that can be combined adaptively based on the query.
- Outperforms strong RAG baselines on the UltraDomain benchmark, with ablations supporting the need for its multi-granular design.
Why It Matters
Aims to make RAG systems more robust and accurate by retrieving the right type of evidence for each query, whether that is a simple fact, a complex relation, or broad context.