Agentic Retrieval-Augmented Generation for Financial Document Question Answering
A new agentic RAG framework cracks complex financial document QA with executable Python code and self-verification loops.
Financial document QA requires multi-step numerical reasoning across tables, narratives, and footnotes—a challenge that single-pass RAG systems fail to handle. The new FinAgent-RAG framework tackles this with an iterative retrieval-reasoning loop that self-verifies answers. It introduces three innovations: a Contrastive Financial Retriever trained with hard negative mining to distinguish semantically similar but numerically distinct passages, a Program-of-Thought module that writes and executes Python code for precise arithmetic instead of relying on error-prone LLM mental math, and an Adaptive Strategy Router that dynamically allocates compute based on question complexity.
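The Program-of-Thought idea can be illustrated with a minimal sketch: instead of asking the LLM to do arithmetic "in its head", the system has it emit a short Python program, executes that program in a restricted namespace, and retries if execution or a basic sanity check fails. Everything below is illustrative, not the paper's actual implementation: `generate_program` is a hardcoded stand-in for an LLM call, and the verification step is reduced to a type/NaN check.

```python
# Sketch of a Program-of-Thought step with a simple self-verification loop.
# `generate_program` stands in for an LLM call; a real system would prompt
# the model with the question plus retrieved table/text evidence.

def generate_program(question: str, evidence: dict) -> str:
    # Stand-in for the LLM: returns Python source that computes `answer`.
    return (
        "revenue_2021 = evidence['revenue_2021']\n"
        "revenue_2020 = evidence['revenue_2020']\n"
        "answer = (revenue_2021 - revenue_2020) / revenue_2020 * 100\n"
    )

def execute_program(source: str, evidence: dict) -> float:
    # Run generated code in a restricted namespace: only `evidence` is
    # exposed, and the program is expected to bind `answer`.
    namespace = {"evidence": evidence}
    exec(source, {"__builtins__": {}}, namespace)
    return namespace["answer"]

def answer_with_verification(question: str, evidence: dict, max_retries: int = 2):
    # Iterative loop: regenerate if execution raises or the result fails
    # a basic sanity check (a heavily simplified form of self-verification).
    for _ in range(max_retries + 1):
        try:
            result = execute_program(generate_program(question, evidence), evidence)
        except Exception:
            continue
        if isinstance(result, (int, float)) and result == result:  # rejects NaN
            return round(result, 2)
    return None

evidence = {"revenue_2020": 1250.0, "revenue_2021": 1375.0}
print(answer_with_verification("What was the YoY revenue growth in 2021?", evidence))
# (1375 - 1250) / 1250 * 100 = 10.0
```

Executing generated code rather than trusting generated numbers is what makes the arithmetic exact; the trade-off is that the sandbox and the retry budget become part of the system design.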
Tested on three benchmarks, FinAgent-RAG achieves 76.81% (FinQA), 78.46% (ConvFinQA), and 74.96% (TAT-QA) execution accuracy—outperforming the strongest baseline by 5.62 to 9.32 percentage points. The router alone reduces API costs by 41.3% on FinQA without sacrificing performance. Cross-backbone evaluations with four LLMs confirm robustness, and the framework is designed for practical deployment in financial institutions. The paper is 22 pages with 11 figures and 13 tables, submitted to Expert Systems with Applications.
- Achieves 76.81%, 78.46%, and 74.96% execution accuracy on FinQA, ConvFinQA, and TAT-QA benchmarks.
- Outperforms strongest baseline by 5.62–9.32 percentage points across all three datasets.
- Adaptive Strategy Router reduces API costs by 41.3% on FinQA while preserving accuracy.
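The router's cost saving comes from sending easy questions down a cheap single-pass path and reserving the full retrieve-reason-verify loop for hard ones. A minimal sketch of that dispatch logic, using hypothetical surface-level complexity heuristics and a threshold that are not the paper's actual classifier:

```python
# Sketch of an adaptive strategy router: score question complexity with
# cheap heuristics, then dispatch to a cheap single-pass path or the full
# agentic loop. Cues and threshold are illustrative assumptions.

MULTI_STEP_CUES = ("change", "ratio", "percentage", "difference", "average", "growth")

def complexity_score(question: str) -> int:
    q = question.lower()
    score = sum(cue in q for cue in MULTI_STEP_CUES)  # arithmetic-heavy wording
    score += q.count(" and ")                         # compound questions
    score += sum(ch.isdigit() for ch in q) // 4       # several literal numbers
    return score

def route(question: str, threshold: int = 2) -> str:
    # single_pass: one retrieval + one LLM call (cheap).
    # agentic_loop: full iterative retrieve-reason-verify loop (expensive).
    return "single_pass" if complexity_score(question) < threshold else "agentic_loop"

print(route("What was total revenue in 2021?"))                               # single_pass
print(route("What was the percentage change in revenue from 2020 to 2021?"))  # agentic_loop
```

Because routing runs before any expensive model call, even a crude classifier like this can cut API spend substantially, as long as misrouting hard questions to the cheap path is rare enough not to dent accuracy.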
Why It Matters
Financial analysts can now reliably query complex filings with precise numerical reasoning, saving time and cutting API costs.