APISENSOR: Robust Discovery of Web API from Runtime Traffic Logs
New framework automatically maps undocumented APIs from noisy traffic, solving a major headache for AI agents.
A team of researchers has introduced APISENSOR, a novel framework designed to automatically discover and reconstruct web application programming interfaces (APIs) by analyzing runtime network traffic. This addresses a critical challenge in modern software engineering and AI agent development: as large language model (LLM)-based agents increasingly rely on APIs to interact with web applications, they are often hindered by incomplete, outdated, or inconsistent documentation. Existing methods are either static, requiring access to internal source code that is unavailable for closed-source systems, or dynamic but fragile, failing when traffic from multiple applications is mixed together at shared collection points. APISENSOR takes a robust, unsupervised black-box approach, making it broadly applicable to real-world scenarios.
APISENSOR works by performing structured analysis on complex traffic logs. Its core innovation is a two-stage process that first denoises and normalizes the captured traffic, then applies a graph-based clustering algorithm to group related requests and accurately infer the underlying API structure. The team rigorously evaluated the system across six different web applications using a dataset of over 10,000 runtime requests injected with simulated mixed-traffic noise. Results showed APISENSOR significantly outperformed ten state-of-the-art baseline methods, achieving an average Group Accuracy Precision of 95.92% and an F1-score of 94.91%. Crucially, it demonstrated superior robustness, with the lowest performance variance and a maximum performance drop of only 8.11 points under noisy conditions. In a practical test, APISENSOR successfully identified inconsistencies in the official documentation of a real application, which were later confirmed by its developers.
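The two-stage idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the host-based noise filter, the static-asset suffix list, the "contains a digit" test for variable path segments, and all function names (`normalize`, `linked`, `discover_endpoints`) are assumptions made for illustration, and connected components stand in for whatever graph-based clustering algorithm APISENSOR actually uses.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Assumed list of static-asset suffixes treated as noise.
STATIC_SUFFIXES = (".js", ".css", ".png", ".jpg", ".ico", ".svg", ".woff2")


def normalize(method, url, target_host):
    """Stage 1: return (METHOD, path segments), or None if the request is noise."""
    parts = urlsplit(url)
    if parts.hostname != target_host:          # traffic from another application
        return None
    if parts.path.lower().endswith(STATIC_SUFFIXES):
        return None
    # Drop the query string and empty segments (handles trailing slashes).
    segments = tuple(s for s in parts.path.split("/") if s)
    return method.upper(), segments


def looks_variable(segment):
    """Assumed heuristic: a segment containing a digit may be an identifier."""
    return any(ch.isdigit() for ch in segment)


def linked(a, b):
    """Edge rule: same method and length, differing only in variable-looking segments."""
    (m1, s1), (m2, s2) = a, b
    if m1 != m2 or len(s1) != len(s2):
        return False
    return all(x == y or (looks_variable(x) and looks_variable(y))
               for x, y in zip(s1, s2))


def discover_endpoints(requests, target_host):
    """Stage 2: cluster normalized requests and emit one template per cluster."""
    normalized = (normalize(m, u, target_host) for m, u in requests)
    nodes = sorted({n for n in normalized if n is not None})
    parent = list(range(len(nodes)))

    def find(i):                               # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            if linked(nodes[i], nodes[j]):
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i, node in enumerate(nodes):
        groups[find(i)].append(node)

    templates = set()
    for members in groups.values():
        method = members[0][0]
        columns = zip(*(segs for _, segs in members))
        # A position with more than one observed value becomes a parameter.
        path = "/".join(col[0] if len(set(col)) == 1 else "{param}" for col in columns)
        templates.add(f"{method} /{path}")
    return sorted(templates)


sample = [
    ("GET", "https://shop.example/users/1"),
    ("get", "https://shop.example/users/2?verbose=1"),
    ("GET", "https://shop.example/users/2/orders"),
    ("GET", "https://shop.example/users/7/orders/"),
    ("POST", "https://shop.example/login"),
    ("GET", "https://cdn.example/users/3"),        # mixed-in traffic from another host
    ("GET", "https://shop.example/static/app.js"), # static asset
]
print(discover_endpoints(sample, "shop.example"))
# ['GET /users/{param}', 'GET /users/{param}/orders', 'POST /login']
```

In this sketch a path seen only once is reported verbatim, since a single observation gives no evidence of which segments vary. The pairwise comparison is quadratic in the number of distinct requests, which is acceptable for a toy example but would need indexing on larger logs.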
- Achieves an average Group Accuracy Precision of 95.92% and an F1-score of 94.91% across six web applications, on a dataset of over 10,000 runtime requests with injected mixed-traffic noise.
- Uses a novel two-stage process combining traffic denoising with graph-based clustering for robustness.
- Outperformed ten state-of-the-art baselines and uncovered inconsistencies in a real application's official documentation, later confirmed by its developers.
Why It Matters
APISENSOR lets AI agents and developers interact autonomously with evolving, poorly documented web services, extending automation to closed-source applications whose source code is unavailable.