Developer Tools

ARISE: A Repository-level Graph Representation and Toolset for Agentic Fault Localization and Program Repair

New graph-based toolset improves Function Recall@1 by 17.0 percentage points on SWE-bench Lite

Deep Dive

A new research paper from Shahd Seddik and Fatemeh Fard presents ARISE (Agentic Repository-level Issue Solving Engine), a graph-based system that extends traditional structural repository representations down to statement-level nodes connected by definition-use edges. This data-flow graph is exposed through a three-tier tool API that makes variable slicing a first-class queryable primitive for LLM-based agents. Agents can trace data dependencies across functions and lines and consume the structured slice outputs directly, with no separate summarization step.
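To make the idea of definition-use edges and variable slicing concrete, here is a minimal sketch using Python's standard `ast` module. It is not ARISE's graph builder or API: the function names `def_use_edges` and `backward_slice` are hypothetical, and the analysis assumes a single function with straight-line code (branches and loops are not modeled precisely).

```python
import ast
from collections import defaultdict

def def_use_edges(func_source: str) -> dict[int, set[int]]:
    """Map each statement line to the lines that define the variables it reads.

    Assumption: the source holds one function with straight-line code, so the
    most recent assignment to a name is always the reaching definition.
    """
    func = ast.parse(func_source).body[0]
    # Parameters count as definitions on the `def` line.
    last_def = {arg.arg: func.lineno for arg in func.args.args}
    deps: dict[int, set[int]] = defaultdict(set)
    for stmt in func.body:
        names = [n for n in ast.walk(stmt) if isinstance(n, ast.Name)]
        reads = [n.id for n in names if isinstance(n.ctx, ast.Load)]
        # `x += 1` reads x even though its target has Store context.
        if isinstance(stmt, ast.AugAssign) and isinstance(stmt.target, ast.Name):
            reads.append(stmt.target.id)
        for name in reads:
            if name in last_def:
                deps[stmt.lineno].add(last_def[name])
        # Record new definitions only after the uses have been resolved.
        for n in names:
            if isinstance(n.ctx, ast.Store):
                last_def[n.id] = stmt.lineno
    return deps

def backward_slice(deps: dict[int, set[int]], line: int) -> list[int]:
    """Return every line the given line transitively depends on, itself included."""
    seen: set[int] = set()
    stack = [line]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(deps.get(current, ()))
    return sorted(seen)

EXAMPLE = """\
def price(qty, unit):
    subtotal = qty * unit
    tax = subtotal * 0.2
    label = "order"
    total = subtotal + tax
    return total
"""

if __name__ == "__main__":
    # Line 4 (`label = ...`) does not influence the return value,
    # so it is absent from the slice of line 6.
    print(backward_slice(def_use_edges(EXAMPLE), 6))
```

In this toy form, a slice query returns a list of line numbers that an agent could read directly, which is the kind of structured output the summary describes; a repository-level system would also need to handle control flow and calls between functions.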

Evaluated on SWE-bench Lite (300 real GitHub issues from 11 Python repositories) with Qwen2.5-Coder-32B-Instruct as the backbone, ARISE outperforms the unmodified SWE-agent baseline on both localization and repair. Function Recall@1 improves by 17.0 percentage points, Line Recall@1 improves by 15.0 points, and Pass@1 repair success reaches 22.0% (66/300), a 4.7-point absolute gain. Controlled ablations confirm that the data-flow graph, not just the tool schema, drives the improvement. The graph builder and slicing API are released as a framework-agnostic drop-in toolset for future automated program repair (APR) research.
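The reported metrics can be read with a short sketch. The `recall_at_k` definition below is one common interpretation (an issue counts as a hit if any of the top-k predicted locations is a gold location) and may differ from the paper's exact definition. The baseline Pass@1 figure is inferred here from the stated 4.7-point gain; it is not quoted in the summary.

```python
def recall_at_k(ranked: list[list[str]], gold: list[set[str]], k: int = 1) -> float:
    """Fraction of issues whose top-k predictions include at least one gold location."""
    hits = sum(1 for preds, truth in zip(ranked, gold) if set(preds[:k]) & truth)
    return hits / len(gold)

def pass_at_1(resolved: int, total: int) -> float:
    """Percentage of issues resolved by the single generated patch."""
    return 100.0 * resolved / total

if __name__ == "__main__":
    # Figures stated in the summary: 66 of 300 issues resolved.
    arise = pass_at_1(66, 300)
    print(f"ARISE Pass@1: {arise:.1f}%")

    # Inferred, not stated: a 4.7-point gain implies a baseline near 17.3%.
    print(f"Implied baseline Pass@1: {arise - 4.7:.1f}%")

    # Toy localization data (hypothetical function names).
    ranked = [["a.f", "a.g"], ["b.h"], ["c.k"]]
    gold = [{"a.f"}, {"b.x"}, {"c.k"}]
    print(f"Function Recall@1 on toy data: {recall_at_k(ranked, gold):.2f}")
```

All gains in the summary are absolute percentage points, so they are differences between two such percentages rather than relative improvements.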

Key Points
  • ARISE adds intra-procedural data-flow edges (definition-use chains) to repository-level program graphs for precise line-level analysis.
  • Achieves 22.0% Pass@1 on SWE-bench Lite (66/300), a 4.7-point improvement over SWE-agent, with Function Recall@1 up 17.0 percentage points.
  • Provides a three-tier tool API that makes data-flow slicing a first-class query for LLM agents, removing the need for natural-language summarization.

Why It Matters

ARISE gives AI coding agents the data-flow precision needed to fix real bugs across files, raising Pass@1 repair success on SWE-bench Lite by 4.7 points over SWE-agent.