ContractSkill: Repairable Contract-Based Skills for Multimodal Web Agents
New research shows explicit 'contracts' for AI skills roughly triple success rates on a challenging web-agent benchmark.
A research team led by Zijian Lu, Yiping Zuo, and four others has introduced ContractSkill, a novel framework designed to solve a critical problem in multimodal web agents: the brittleness of on-demand generated skills. Current AI agents often create implicit, one-off procedures that fail when execution environments change slightly. ContractSkill formalizes these skills into explicit 'contracts' containing preconditions (what must be true to start), step-by-step specifications, postconditions (the desired outcome), recovery rules for errors, and termination checks. This structured representation transforms vague AI instructions into verifiable, repairable artifacts.
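To make the five contract components concrete, here is a minimal sketch of how such a contract might be represented and executed. It is an illustration only, not the authors' implementation: the names `SkillContract`, `run_contract`, `max_recoveries`, and the toy form-submission skill are all hypothetical, and the page is modeled as a plain dictionary rather than a real browser. The sketch covers runtime checks and recovery; it does not attempt to model how ContractSkill repairs a contract after a failure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a page snapshot, a predicate over it, and one action.
State = Dict[str, object]
Check = Callable[[State], bool]
Action = Callable[[State], State]


def never_terminate(state: State) -> bool:
    """Default termination check: never stop early."""
    return False


@dataclass
class SkillContract:
    name: str
    preconditions: List[Check]    # what must be true to start
    steps: List[Action]           # step-by-step specification
    postconditions: List[Check]   # the desired outcome
    recovery_rules: Dict[type, Action] = field(default_factory=dict)
    is_terminated: Check = never_terminate
    max_recoveries: int = 3       # assumed cap so recovery cannot loop forever


def run_contract(contract: SkillContract, state: State) -> Tuple[bool, State]:
    """Run the steps in order; return (postconditions hold, final state)."""
    if not all(check(state) for check in contract.preconditions):
        return False, state
    recoveries = 0
    i = 0
    while i < len(contract.steps):
        if contract.is_terminated(state):
            break
        try:
            state = contract.steps[i](state)
            i += 1
        except Exception as exc:
            handler = contract.recovery_rules.get(type(exc))
            if handler is None or recoveries >= contract.max_recoveries:
                return False, state
            state = handler(state)  # recover, then retry the same step
            recoveries += 1
    return all(check(state) for check in contract.postconditions), state


# Toy skill: a popup blocks the form until a recovery rule closes it.
def open_form(state: State) -> State:
    if state.get("popup"):
        raise RuntimeError("popup blocks the form")
    return {**state, "form_open": True}


def submit(state: State) -> State:
    return {**state, "submitted": True}


def close_popup(state: State) -> State:
    return {**state, "popup": False}


contract = SkillContract(
    name="submit_form",
    preconditions=[lambda s: s.get("logged_in") is True],
    steps=[open_form, submit],
    postconditions=[lambda s: s.get("submitted") is True],
    recovery_rules={RuntimeError: close_popup},
)

ok, final_state = run_contract(contract, {"logged_in": True, "popup": True})
print(ok, final_state)
```

Because every component is explicit data, a failed run can be traced to a specific unmet precondition, failing step, or missing recovery rule, which is the property that makes a skill verifiable and repairable rather than a one-off procedure.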
The impact is substantial. In experiments using models such as GLM-4.6V and Qwen3.5-Plus on standard benchmarks, ContractSkill markedly improved reliability. On the challenging VisualWebArena benchmark, it raised success rates for self-generated skills from baselines of 9.4% and 10.9% to 28.1% and 37.5%, respectively, roughly tripling performance. On MiniWoB, success rates rose from 66.5% and 60.5% to 77.5% and 81.0%, respectively. Crucially, the repaired skill 'artifacts' are transferable across different AI models, improving a target model's performance by up to 47.8 percentage points without retraining. This moves skill development from fragile generation toward robust engineering.
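The reported figures can be checked with a few lines of arithmetic. The snippet below is only a convenience calculation over the numbers quoted above; the labels "first pair" and "second pair" are placeholders for the two reported configurations and are not names from the paper.

```python
# (label, baseline %, ContractSkill %) as quoted in the text above.
reported = [
    ("VisualWebArena, first pair", 9.4, 28.1),
    ("VisualWebArena, second pair", 10.9, 37.5),
    ("MiniWoB, first pair", 66.5, 77.5),
    ("MiniWoB, second pair", 60.5, 81.0),
]

summary = []
for label, before, after in reported:
    points = round(after - before, 1)  # absolute gain in percentage points
    ratio = round(after / before, 2)   # relative gain over the baseline
    summary.append((label, points, ratio))
    print(f"{label}: +{points} points, {ratio}x the baseline")
```

The VisualWebArena gains work out to about 3.0x and 3.4x the baseline, while the MiniWoB gains are about 1.17x and 1.34x, so the "tripling" description applies to VisualWebArena only.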
- Framework converts draft AI skills into verifiable contracts with preconditions, postconditions, and recovery rules.
- Raised success rates from 9.4% to 28.1% (and from 10.9% to 37.5%) on VisualWebArena, and from 66.5% to 77.5% (and from 60.5% to 81.0%) on MiniWoB.
- Repaired skills are transferable, boosting a different model's performance by up to 47.8 points without regeneration.
Why It Matters
ContractSkill shifts AI agent development from fragile, one-off scripting toward reliable, verifiable software engineering, making web automation more robust.