Developer Tools

Reproduction Test Generation for Java SWE Issues

250 Java bugs get automated reproduction tests, closing the Python gap...

Deep Dive

The paper introduces TDD-Bench-Java, the first benchmark for repository-level reproduction test generation in Java, with 250 instances drawn from popular open-source repositories. Its solution, e-Otter++ for Java, adapts a state-of-the-art Python reproduction test generator to Java. The generated tests are execution-based: each one should fail on the buggy code, confirming that the bug is present, and pass once the fix is applied, confirming that it is gone. The authors report results both on TDD-Bench-Java and on a contamination-free proprietary dataset, with the aim of improving bug diagnosis and fix validation in Java software development.
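To make the fail-before, pass-after criterion concrete, here is a minimal Java sketch. It is not e-Otter++'s implementation or the benchmark's harness: the class `FailToPassCheck`, its methods, and the ceiling-division bug are all hypothetical, chosen only to illustrate how a candidate test can be judged against a buggy and a fixed version of the same code.

```java
import java.util.function.Consumer;
import java.util.function.IntBinaryOperator;

// Hypothetical illustration of the fail-to-pass criterion; not taken from the paper.
public class FailToPassCheck {

    /** Runs a test against one implementation; returns true if no AssertionError is thrown. */
    static boolean passes(Consumer<IntBinaryOperator> test, IntBinaryOperator impl) {
        try {
            test.accept(impl);
            return true;
        } catch (AssertionError e) {
            return false;
        }
    }

    /** A test reproduces the bug if it fails on the buggy code and passes on the fixed code. */
    static boolean reproducesBug(Consumer<IntBinaryOperator> test,
                                 IntBinaryOperator buggy,
                                 IntBinaryOperator fixed) {
        return !passes(test, buggy) && passes(test, fixed);
    }

    public static void main(String[] args) {
        // Assumed bug: ceiling division that truncates instead of rounding up.
        IntBinaryOperator buggy = (a, b) -> a / b;
        IntBinaryOperator fixed = (a, b) -> (a + b - 1) / b;

        // Exercises the faulty case, so it fails before the fix and passes after.
        Consumer<IntBinaryOperator> reproTest = ceilDiv -> {
            if (ceilDiv.applyAsInt(7, 2) != 4) {
                throw new AssertionError("expected ceilDiv(7, 2) == 4");
            }
        };

        // Passes on both versions, so it says nothing about the bug.
        Consumer<IntBinaryOperator> weakTest = ceilDiv -> {
            if (ceilDiv.applyAsInt(8, 2) != 4) {
                throw new AssertionError("expected ceilDiv(8, 2) == 4");
            }
        };

        System.out.println(reproducesBug(reproTest, buggy, fixed)); // true
        System.out.println(reproducesBug(weakTest, buggy, fixed));  // false
    }
}
```

The second test shows why execution on both versions matters: a test that passes before and after the fix compiles and runs cleanly, yet it does not confirm that the bug ever existed.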

Key Points
  • TDD-Bench-Java is the first reproduction test generation benchmark for Java, with 250 instances from popular open-source repositories
  • e-Otter++ for Java adapts a state-of-the-art Python generator to produce execution-based tests that fail before a fix and pass after it
  • Validation includes both benchmark results and a contamination-free proprietary dataset from industry

Why It Matters

Automates the creation of bug-reproducing tests for Java, a language widely used in enterprise software, which could speed up both bug diagnosis and fix validation.