Finding Memory Leaks in C/C++ Programs via Neuro-Symbolic Augmented Static Analysis
New AI tool combines LLMs and Z3 reasoning to boost static analyzers, finding 52 memory leaks (47 confirmed and fixed) at about $1.70 per detected bug.
A research team has introduced MemHint, a neuro-symbolic pipeline for detecting memory leaks in large C/C++ codebases. It targets two weaknesses of traditional static analyzers such as CodeQL and Infer: they cannot recognize project-specific memory management functions, and their path-sensitive analysis is limited, so they miss real leaks. MemHint starts with a two-stage process. First, a large language model (LLM) reads the codebase and classifies each function as an allocator, a deallocator, or neither, producing summaries of memory ownership. Second, those summaries are validated with Z3, an SMT solver, which checks whether the claimed memory operations are feasible against the program's control-flow graph.
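To make the "project-specific memory management function" problem concrete, here is a minimal sketch in C. All names (`buf_t`, `buf_new`, `buf_free`, `parse_leaky`, `parse_fixed`, `live_buffers`) are hypothetical and are not taken from MemHint or from the evaluated projects. An analyzer that only models `malloc` and `free` may not know that `buf_new` hands ownership to its caller or that `buf_free` releases its argument, which is the kind of fact an ownership summary is meant to supply.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical project-specific buffer type (illustration only). */
typedef struct {
    char  *data;
    size_t len;
} buf_t;

/* Number of buffers currently allocated, so a leak is observable. */
static int live_buffers = 0;

/* Allocator wrapper: the caller owns the returned buffer.
 * A summary for this function would say "returns owned memory". */
buf_t *buf_new(size_t len) {
    buf_t *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->data = calloc(len ? len : 1, 1);
    if (b->data == NULL) { free(b); return NULL; }
    b->len = len;
    live_buffers++;
    return b;
}

/* Deallocator wrapper: releases the buffer passed as its argument.
 * A summary for this function would say "frees parameter 0". */
void buf_free(buf_t *b) {
    if (b == NULL) return;
    free(b->data);
    free(b);
    live_buffers--;
}

/* Leaky version: the early return on a malformed header skips buf_free. */
int parse_leaky(const char *input) {
    assert(input != NULL);
    buf_t *b = buf_new(strlen(input) + 1);
    if (b == NULL) return -1;
    if (input[0] != '#') return -1;   /* leak: b is never released */
    memcpy(b->data, input, strlen(input) + 1);
    buf_free(b);
    return 0;
}

/* Fixed version: every path after a successful buf_new reaches buf_free. */
int parse_fixed(const char *input) {
    assert(input != NULL);
    buf_t *b = buf_new(strlen(input) + 1);
    if (b == NULL) return -1;
    int rc = -1;
    if (input[0] == '#') {
        memcpy(b->data, input, strlen(input) + 1);
        rc = 0;
    }
    buf_free(b);
    return rc;
}
```

Calling `parse_leaky("bad")` leaves `live_buffers` at 1, while `parse_fixed("bad")` leaves it at 0. The counter exists only to make the leak visible in this sketch; a static analyzer would have to infer the same thing from the code alone.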
The validated summaries are injected back into the static analyzers, extending their knowledge beyond standard functions like `malloc` and `free`. A final Z3-based step filters out warnings on infeasible execution paths, and a concluding LLM validation confirms genuine bugs. In an evaluation across seven real-world projects totaling over 3.4 million lines of code, MemHint detected 52 unique memory leaks, leading to 47 confirmed and fixed bugs and the submission of 4 CVEs (Common Vulnerabilities and Exposures). The baseline tools, CodeQL and Infer without MemHint's summaries, found only 19 and 3 leaks respectively. MemHint's estimated cost was approximately $1.70 per successfully detected bug, which the authors present as a cost-effective way to improve software security and reliability.
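The infeasible-path filter can be illustrated with a small C sketch. The function and helper names below (`checksum`, `tracked_alloc`, `tracked_free`, `live_allocs`) are hypothetical, and this shows only the kind of false positive such a filter targets, not MemHint's actual implementation. A path-insensitive analysis may report a leak on the path where the first `if (use_copy)` is taken and the second is not. That path requires `use_copy != 0` and `use_copy == 0` at the same time, so a solver such as Z3 would find the path condition unsatisfiable and the warning could be dropped.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Number of live allocations, used only to observe behavior in this sketch. */
static int live_allocs = 0;

static void *tracked_alloc(size_t n) {
    void *p = malloc(n);
    if (p != NULL) live_allocs++;
    return p;
}

static void tracked_free(void *p) {
    if (p == NULL) return;
    free(p);
    live_allocs--;
}

/* Sums n bytes, optionally working on a private copy of the input.
 * The allocation and the release are guarded by the same condition,
 * so no feasible path allocates scratch without freeing it. */
int checksum(const unsigned char *src, size_t n, int use_copy) {
    assert(src != NULL);
    unsigned char *scratch = NULL;
    const unsigned char *p = src;
    int sum = 0;

    if (use_copy) {                        /* branch A */
        scratch = tracked_alloc(n ? n : 1);
        if (scratch == NULL) return -1;
        memcpy(scratch, src, n);
        p = scratch;
    }

    for (size_t i = 0; i < n; i++) sum += p[i];

    if (use_copy) tracked_free(scratch);   /* branch B, same condition as A */
    return sum;
}
```

Because `use_copy` is never modified between the two branches, the "A taken, B skipped" path cannot execute, and `live_allocs` returns to 0 after every call.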
- Combines LLM semantic understanding with Z3 symbolic reasoning to augment static analyzers CodeQL and Infer.
- Found 52 unique memory leaks (47 confirmed/fixed, 4 CVEs submitted) in 3.4M lines of real C/C++ code, versus 19 and 3 for the baseline tools.
- Operates at an estimated cost of ~$1.70 per detected bug, suggesting it is practical and cost-effective as a security tool.
Why It Matters
MemHint offers a scalable, automated way to find memory leaks in large C/C++ codebases, including legacy systems, reducing manual review costs and improving code safety.