FeedbackLLM: Metadata-Driven Multi-Agentic Language-Agnostic Test Case Generator with Evolving Prompts and Coverage Feedback
Two specialized LLM agents work in tandem to fix missed lines and branches.
Traditional test case generation relies on manual effort or single-shot LLM prompts, which often miss branches and produce redundant cases. To address this, a team of researchers introduces FeedbackLLM, a language-agnostic framework that tightly couples two stages: first, it parses source code to extract input constraints and generates preliminary test cases; second, it employs two specialized LLM agents, a Line Feedback Agent and a Branch Feedback Agent, that evaluate coverage and feed metadata back into the generator. This iterative process runs for k steps, with the agents working in tandem to progressively improve coverage, while a redundancy prevention cache avoids duplicate API calls.
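The paper does not publish its prompts or orchestration code, but the described loop maps onto a short sketch. Below is a minimal, hypothetical Python rendering of the two-stage process; all names (`run_with_coverage`, `call_llm`, the prompt templates) are illustrative stubs, not FeedbackLLM's actual interfaces:

```python
from dataclasses import dataclass

@dataclass
class CoverageReport:
    missed_lines: set[int]       # line numbers never executed
    missed_branches: set[tuple]  # (line, outcome) pairs never taken

def run_with_coverage(source_path: str, tests: list[str]) -> CoverageReport:
    """Execute tests and collect line/branch coverage (e.g. coverage.py
    for Python, gcov for C); tool-specific, so left as a stub here."""
    raise NotImplementedError

def call_llm(prompt: str) -> list[str]:
    """Query the underlying model for candidate test cases (stub)."""
    raise NotImplementedError

def line_feedback_prompt(source: str, report: CoverageReport) -> str:
    # The Line Feedback Agent folds uncovered lines back into the prompt.
    return (f"These lines are never executed: {sorted(report.missed_lines)}. "
            f"Write test inputs that reach them.\n\n{source}")

def branch_feedback_prompt(source: str, report: CoverageReport) -> str:
    # The Branch Feedback Agent does the same for untaken branch outcomes.
    return (f"These branch outcomes are never taken: "
            f"{sorted(report.missed_branches)}. "
            f"Write test inputs that take them.\n\n{source}")

def feedback_loop(source_path: str, k: int) -> list[str]:
    source = open(source_path).read()
    # Stage 1: preliminary tests from the parsed source and its constraints.
    tests = call_llm(f"Generate test cases for this program:\n\n{source}")
    # Stage 2: k rounds of coverage-guided regeneration.
    for _ in range(k):
        report = run_with_coverage(source_path, tests)
        if not report.missed_lines and not report.missed_branches:
            break  # full coverage reached before exhausting the budget
        tests += call_llm(line_feedback_prompt(source, report))
        tests += call_llm(branch_feedback_prompt(source, report))
    return tests
```

Because each iteration only asks for tests targeting the remaining coverage gap, the loop does bounded work per step, which is consistent with the linear time scaling reported below.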
FeedbackLLM was tested on standard C and Python benchmark programs. Results show it consistently outperforms baseline tools in both line and branch coverage, while execution time grows only linearly with the number of feedback iterations, a marked improvement over exponentially scaling or manual approaches. The multi-agent design reduces hallucination and eliminates redundant test cycles, making it suitable for complex, real-world software systems. This approach could automate a traditionally labor-intensive part of software engineering, especially for CI/CD pipelines and large codebases.
- Two LLM feedback agents (Line and Branch) iteratively improve coverage over k steps.
- Redundancy prevention cache eliminates duplicate API requests and execution cycles (see the sketch after this list).
- Achieves higher line and branch coverage than baselines on C and Python benchmarks, with linear time scaling.
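The redundancy prevention cache is described only at a high level. One plausible realization is sketched below, under the assumption that it keys on a hash of the prompt and remembers previously executed inputs; `RedundancyCache` and its methods are hypothetical names, not the paper's API:

```python
import hashlib

class RedundancyCache:
    """Memoizes LLM responses by prompt hash and tracks which test
    inputs have already been executed, so neither API calls nor
    execution cycles are repeated."""

    def __init__(self):
        self._responses: dict[str, list[str]] = {}
        self._executed: set[str] = set()

    @staticmethod
    def _key(prompt: str) -> str:
        # Identical prompts hash to the same key, so a repeated
        # request is served from the cache instead of the API.
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def get_or_call(self, prompt: str, llm_call) -> list[str]:
        key = self._key(prompt)
        if key not in self._responses:  # cache miss: one paid API call
            self._responses[key] = llm_call(prompt)
        return self._responses[key]

    def fresh_tests(self, tests: list[str]) -> list[str]:
        # Drop inputs already run in an earlier iteration so each
        # execution cycle exercises only genuinely new cases.
        new = [t for t in tests if t not in self._executed]
        self._executed.update(new)
        return new
```

In the loop sketched earlier, each `call_llm` invocation would be routed through `get_or_call`, and `fresh_tests` would filter the candidate list before coverage is re-measured.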
Why It Matters
Automates software testing with scalable, high-coverage test generation, reducing both manual effort and LLM hallucination.