Falkor-IRAC uses graph-constrained generation to verify legal reasoning
A new framework aims to eliminate hallucinated precedents by accepting only answers that can be verified against a knowledge graph.
Legal reasoning is more than semantic similarity: it involves precedent propagation, procedural state transitions, and statute-bound inference. Vector-based RAG systems often hallucinate precedents or cite outdated statutes, a critical problem in high-caseload jurisdictions like India. Joy Bose's new paper introduces Falkor-IRAC, a framework that replaces open-ended generation with graph-constrained generation. It ingests judgments from India's Supreme Court and High Courts into an IRAC knowledge graph stored in FalkorDB, capturing Issue, Rule, Analysis, and Conclusion nodes along with procedural state transitions and statutory references. At inference time, a ‘Verifier Agent’ acts as a falsifiability oracle: an LLM-generated answer is accepted only if a valid supporting path can be traced through the graph. The system also surfaces doctrinal conflicts as a first-class output rather than silently ignoring them.
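To make the accept-or-reject rule concrete, here is a minimal Python sketch of a path-based verifier. It is not the paper's implementation: the in-memory dictionary stands in for the FalkorDB graph, and the node ids, edge labels, and the function names `find_supporting_path` and `verify` are all hypothetical, chosen only for illustration.

```python
from collections import deque

# Hypothetical toy IRAC graph: node id -> list of (edge label, target node id).
# A real system would query a graph database instead of a dict.
GRAPH = {
    "issue:bail_delay": [("GOVERNED_BY", "rule:statute_s1")],
    "rule:statute_s1": [("APPLIED_IN", "analysis:case_a")],
    "analysis:case_a": [("LEADS_TO", "conclusion:case_a")],
    "conclusion:case_a": [],
}


def find_supporting_path(graph, issue, conclusion):
    """Breadth-first search for a path from an Issue node to a Conclusion node.

    Returns the list of node ids on the path, or None if no path exists
    or either endpoint is missing from the graph.
    """
    if issue not in graph or conclusion not in graph:
        return None
    queue = deque([[issue]])
    seen = {issue}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == conclusion:
            return path
        for _label, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(path + [target])
    return None


def verify(graph, issue, cited_conclusion):
    """Accept an answer only if a supporting path can be traced in the graph."""
    path = find_supporting_path(graph, issue, cited_conclusion)
    return {"accepted": path is not None, "path": path}
```

Under these assumptions, a citation that exists in the graph and is reachable from the issue is accepted along with its path, while a fabricated node id such as `"conclusion:fabricated"` is rejected because it cannot be reached.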
Falkor-IRAC is evaluated using graph-native metrics (citation grounding accuracy, path validity rate, hallucinated precedent rate, and conflict detection rate), which the author argues are more appropriate than BLEU or ROUGE for legal reasoning. On a proof-of-concept corpus of 51 Supreme Court judgments, the Verifier Agent correctly validated citations on completed queries and correctly rejected fabricated citations. The paper leaves comparison against vector-only RAG baselines to future work, along with GPU-accelerated inference to address current CPU timeout rates. The work is available as a 20-page arXiv preprint (arXiv:2605.14665) and points to a promising direction for verified AI in judicial settings.
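The four metric names can be read as simple ratios over per-query verifier results. The sketch below shows one plausible reading in Python. The definitions are assumptions made for illustration, and the paper's exact formulas may differ. The `QueryResult` fields and the `graph_metrics` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class QueryResult:
    cited: int              # citations appearing in the generated answer
    grounded: int           # of those, citations found in the graph
    path_valid: bool        # verifier traced a valid supporting path
    conflicts_present: int  # doctrinal conflicts known to apply to the query
    conflicts_flagged: int  # of those, conflicts the system surfaced


def ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def graph_metrics(results):
    """Aggregate assumed metric definitions over a list of QueryResult."""
    cited = sum(r.cited for r in results)
    grounded = sum(r.grounded for r in results)
    present = sum(r.conflicts_present for r in results)
    flagged = sum(r.conflicts_flagged for r in results)
    valid = sum(1 for r in results if r.path_valid)
    return {
        # Assumed: share of cited authorities that exist in the graph.
        "citation_grounding_accuracy": ratio(grounded, cited),
        # Assumed: share of cited authorities that do not exist in the graph.
        "hallucinated_precedent_rate": ratio(cited - grounded, cited),
        # Assumed: share of queries with a valid supporting path.
        "path_validity_rate": ratio(valid, len(results)),
        # Assumed: share of known conflicts that were surfaced.
        "conflict_detection_rate": ratio(flagged, present),
    }
```

Unlike BLEU or ROUGE, none of these ratios compare the wording of an answer to a reference text. They depend only on whether the cited material and the reasoning path exist in the graph.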
- Falkor-IRAC enforces legal reasoning chains through graph-constrained generation rather than relying on semantic similarity.
- A Verifier Agent checks LLM outputs against an IRAC knowledge graph, accepting only answers with valid supporting paths.
- Tested on 51 Indian Supreme Court judgments; the system correctly validated real citations and rejected fabricated ones.
Why It Matters
Graph-verified generation could reduce AI hallucinations in legal work and improve access to justice in high-caseload jurisdictions like India.