Developer Tools

FlyCatcher: Neural Inference of Runtime Checkers from Tests

Automatically detects silent failures in complex software, without requiring hand-written checkers.

Deep Dive

FlyCatcher, developed by Beatriz Souza, Chang Lou, Suman Nath, and Michael Pradel, tackles the problem of silent failures in complex software: bugs that violate intended semantics without explicit crashes. The system automatically infers runtime checkers from existing test suites, a resource already available for most software. It combines LLM-based code synthesis with static analysis and dynamic validation to generate checkers that monitor specific method calls and assert behavioral properties during execution. The checkers are stateful: each maintains a shadow state, an abstraction of the actual system state, so that properties spanning multiple calls can be checked.
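
To make the idea concrete, here is a minimal sketch of what an inferred stateful checker could look like, assuming a hypothetical key-value store as the monitored system; the class name, the hook methods, and the shadow-state design are illustrative assumptions, not FlyCatcher's actual output format:

    # Hypothetical sketch of an inferred stateful checker; the monitored
    # store API (put/delete/get) and the hook naming are assumptions.
    class KVStoreChecker:
        """Maintains a shadow state (the set of live keys) that abstracts
        the real store's contents, so that properties spanning several
        calls can be asserted."""

        def __init__(self):
            self.shadow_keys = set()  # abstraction of the actual state

        def after_put(self, key, ok):
            # Property: a successful put makes the key visible.
            if ok:
                self.shadow_keys.add(key)

        def after_delete(self, key, ok):
            # Property: deleting a key the shadow state holds must succeed.
            if key in self.shadow_keys:
                assert ok, f"silent failure: delete({key!r}) failed for a live key"
                self.shadow_keys.discard(key)

        def after_get(self, key, result):
            # Property: a live key must be retrievable.
            if key in self.shadow_keys:
                assert result is not None, (
                    f"silent failure: get({key!r}) returned nothing for a live key")

At runtime, instrumentation would invoke these hooks after each monitored call; a failed assertion surfaces a semantic violation that, on its own, produces no crash.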

In evaluations across 400 tests from four popular, complex systems, FlyCatcher inferred 334 checkers, with 300 validated as correct via cross-validation. Compared to state-of-the-art approaches, it produced 2.6x more correct checkers and detected 5.2x more errors. This leap in automation could make runtime checking practical for everyday development, helping engineers catch silent failures that traditional testing misses, without the manual effort of writing custom checkers.
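
One way to picture the cross-validation step, under the assumption (not detailed in the summary) that a candidate checker is replayed against executions of known-passing tests and discarded if it ever raises a false alarm; run_test_with_checker is a hypothetical helper:

    # Illustrative cross-validation of candidate checkers. The helper
    # run_test_with_checker(test, checker) is assumed to execute one test
    # with the checker's hooks attached and return True if the checker fired.
    def validate_checkers(candidates, passing_tests, run_test_with_checker):
        """Keep only checkers that stay silent on passing executions."""
        validated = []
        for checker in candidates:
            fired = any(run_test_with_checker(t, checker) for t in passing_tests)
            if not fired:
                validated.append(checker)
        return validated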

Key Points
  • FlyCatcher uses LLMs, static analysis, and dynamic validation to infer runtime checkers from existing tests.
  • Evaluated on 400 tests from four complex systems, it generated 334 checkers (300 correct), a 2.6x improvement over prior methods.
  • Detected 5.2x more errors than state-of-the-art approaches, addressing silent failures that cause no explicit crashes.

Why It Matters

Automates runtime checker creation, catching silent bugs in production without manual effort.