Developer Tools

Automatic, Expressive, and Scalable Fuzzing with Stitching

New 'stitching' technique, automated with LLMs, found three times as many bugs in benchmark testing as four other tools combined.

Deep Dive

A team from Carnegie Mellon University has introduced STITCH, an LLM-powered fuzzing system aimed at automating software testing. Its core idea is 'stitching': API usage constraints are encoded in modular pieces that the fuzzer assembles dynamically at runtime. The approach combines static type checking with dynamic typestate tracking, so specifications can express rich semantic constraints such as object state dependencies and cross-function preconditions.
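To make the idea of modular, runtime-assembled constraints concrete, here is a minimal Python sketch. It is not STITCH's actual specification language or implementation. The `Fragment` format, the toy `handle_*` API, and the `stitch` function are all hypothetical, invented here for illustration. Each fragment declares a static type and a required typestate for its argument, and the generator only picks fragments whose constraints hold for some live object at that moment.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Fragment:
    """One modular piece of an API specification (hypothetical format)."""
    name: str
    arg_type: Optional[str]    # static type of the argument, None if it takes none
    pre_state: Optional[str]   # typestate the argument must be in before the call
    ret_type: Optional[str]    # static type of a newly created object, if any
    post_state: str            # typestate of the argument (or new object) afterwards


# Toy specification for an imaginary handle API (assumption, not a real library).
FRAGMENTS = [
    Fragment("handle_open", None, None, "Handle", "open"),
    Fragment("handle_write", "Handle", "open", None, "open"),
    Fragment("handle_close", "Handle", "open", None, "closed"),
]


def applicable(frag: Fragment, objects: Dict[int, Tuple[str, str]]) -> List[Optional[int]]:
    """Return the object ids this fragment may be applied to right now."""
    if frag.arg_type is None:
        return [None]  # constructors need no existing object
    return [
        oid
        for oid, (typ, state) in objects.items()
        if typ == frag.arg_type and state == frag.pre_state
    ]


def stitch(steps: int, seed: int) -> List[Tuple[str, int]]:
    """Assemble a call sequence step by step instead of fixing it up front."""
    rng = random.Random(seed)
    objects: Dict[int, Tuple[str, str]] = {}  # object id -> (type, typestate)
    trace: List[Tuple[str, int]] = []
    for _ in range(steps):
        # Static check (type) and dynamic check (typestate) happen per step.
        choices = [(f, oid) for f in FRAGMENTS for oid in applicable(f, objects)]
        frag, oid = rng.choice(choices)
        if frag.ret_type is not None:
            oid = len(objects)  # allocate a fresh object id
            objects[oid] = (frag.ret_type, frag.post_state)
        else:
            typ, _ = objects[oid]
            objects[oid] = (typ, frag.post_state)
        trace.append((frag.name, oid))
    return trace
```

Calling `stitch(10, seed=0)` returns a list of `(call name, object id)` pairs. Because the typestate table is consulted before each step, a closed handle is never written to or closed again, while the order and mix of calls is still chosen at runtime rather than committed to in advance.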

STITCH uses large language models (LLMs) to automate the entire fuzzing pipeline: configuring projects, synthesizing specifications from code, triaging crashes, and repairing specifications. Compared against four state-of-the-art tools on 33 benchmarks, STITCH achieved the highest code coverage on 21 benchmarks and found 30 true-positive bugs, versus 10 for all other tools combined. It also maintained 70% precision, versus 12% for the next-best LLM-based tool.

When deployed automatically on 1,365 widely used open-source projects, STITCH discovered 131 previously unknown bugs across 102 projects, 73 of which maintainers have already patched. The research addresses two limitations of current fuzzing techniques: they either commit to fixed API sequences too early or lack the expressiveness needed for real-world constraints.

Key Points
  • STITCH found 30 true-positive bugs vs. 10 by all other tools combined in benchmark testing
  • Achieved 70% precision rate compared to 12% for next-best LLM-based fuzzing tool
  • Discovered 131 new bugs across 102 projects when deployed on 1,365 open-source repositories

Why It Matters

Automates high-quality software testing at scale, potentially preventing security vulnerabilities in widely used open-source libraries.