Systematic API Testing Through Model Checking and Executable Contracts
Researchers combine formal verification with executable contracts to address the API test-oracle problem.
A team of researchers has introduced IcePick, a novel framework that applies formal verification techniques to the persistent challenge of automated API testing. Traditional black-box API testing often suffers from a semantic gap, where interface specs define operations but lack behavioral details, leading to the test-oracle problem and ineffective stateful test sequences. IcePick bridges this gap by using the TLA+ formal specification language to model API state evolution. It then employs the TLC model checker to exhaustively explore the reachable state space, generating test sequences that provably cover the behavioral model. To manage the classic state-space explosion problem, the team developed a coverage-guided breadth-first traversal of TLC's state graph.
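To make the traversal idea concrete, here is a minimal sketch in Python of a breadth-first walk over a state graph that keeps one shortest operation sequence per newly covered state. It is not IcePick's implementation and does not call TLC: the three-state resource model, the endpoint names, and the `generate_sequences` helper are all hypothetical, chosen only to illustrate how skipping already covered states keeps the number of generated sequences small.

```python
from collections import deque

# Hypothetical toy model (not from the paper): one resource that is
# "absent", "active", or "archived". Each operation maps a state to its
# successor, or to None when the operation is not enabled in that state.
OPERATIONS = {
    "POST /items": lambda s: "active" if s == "absent" else None,
    "PUT /items/1/archive": lambda s: "archived" if s == "active" else None,
    "DELETE /items/1": lambda s: "absent" if s in ("active", "archived") else None,
}


def generate_sequences(initial):
    """Breadth-first walk that records one shortest path per covered state."""
    paths = {initial: []}      # state -> operation sequence that reaches it
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for op, step in OPERATIONS.items():
            nxt = step(state)
            # Coverage guidance: ignore disabled operations and any
            # transition that only leads back to an already covered state.
            if nxt is None or nxt in paths:
                continue
            paths[nxt] = paths[state] + [op]
            queue.append(nxt)
    return paths


sequences = generate_sequences("absent")
for state, ops in sequences.items():
    print(state, ops)
```

Each value in `sequences` is a candidate test sequence: replaying its operations against the real API should drive the service into the corresponding model state. This sketch covers states only; covering every transition would need a different bookkeeping structure.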
To address the oracle problem beyond simple HTTP status codes, the researchers also created Glacier, a first-order logic contract language. Glacier enriches API specifications with executable semantic contracts, enabling automated behavioral verification during test execution. The system was evaluated on the EvoMaster Benchmark, where it demonstrated the ability to achieve complete state coverage and uncover faults in complex, multi-operation interactions that traditional methods typically miss. The paper also includes a scalability analysis that outlines the practical limits and requirements for applying IcePick to real-world, critical API-based systems, offering a path toward more reliable and reproducible test suites with strong coverage guarantees.
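The difference between a status-code check and a semantic contract can be sketched as follows. This example does not use Glacier syntax, which is not shown in this summary; plain Python predicates stand in for the contract language, and the `Contract` class, the `delete_item` handler, and the in-memory `store` are hypothetical. The handler is deliberately faulty so the two oracles disagree.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Contract:
    """Hypothetical stand-in for an executable postcondition."""
    name: str
    # Predicate over (state before, response, state after, target id).
    holds: Callable[[dict, dict, dict, str], bool]


def delete_item(store: dict, item_id: str) -> dict:
    """Deliberately faulty handler: reports success but deletes nothing."""
    return {"status": 204}


# If the API answers 204, the target must be gone and every other item
# that existed before must still exist (a universally quantified check).
deleted_is_gone = Contract(
    name="a successful DELETE removes exactly the target item",
    holds=lambda before, resp, after, target: resp["status"] != 204 or (
        target not in after
        and all(key in after for key in before if key != target)
    ),
)

store = {"1": {"name": "widget"}, "2": {"name": "gadget"}}
before = dict(store)                      # snapshot of the pre-state
response = delete_item(store, "1")

status_ok = response["status"] == 204     # status-code oracle
contract_ok = deleted_is_gone.holds(before, response, store, "1")

print("status-code oracle passes:", status_ok)
print("contract oracle passes:", contract_ok)
```

The status-code oracle accepts the faulty response, while the contract rejects it because the item is still present after the call. That gap is the kind of behavioral fault a semantic contract is meant to expose.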
- IcePick uses TLA+ modeling and TLC model checking to generate API test sequences with provable state-space coverage.
- The Glacier contract language adds executable first-order logic contracts to APIs for automated behavioral verification.
- Evaluation on the EvoMaster Benchmark shows IcePick finds faults in multi-operation interactions missed by traditional methods.
Why It Matters
This approach could significantly improve the reliability of critical microservices and cloud APIs by automating deep, stateful testing.