Ambig-IaC: Multi-level Disambiguation for Interactive Cloud Infrastructure-as-Code Synthesis
New training-free method tackles ambiguous cloud requests, with relative gains of 18.4% to 25.4% over the strongest baseline.
A research team including Zhenning Yang, Kaden Gruizenga, and Tongyuan Miao has developed Ambig-IaC, a framework that addresses a key bottleneck in AI-assisted cloud management. The core problem is that user requests for Infrastructure-as-Code (IaC) configurations, such as "set up a web server," are inherently ambiguous, and unlike regular code, cloud configurations are too expensive to deploy and test iteratively. The researchers observed that ambiguity in IaC is structured and hierarchical, decomposing into three axes: resources (what components), topology (how they connect), and attributes (their specific settings). Higher-level choices constrain lower-level ones, which makes the problem tractable.
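To make the three axes concrete, here is a minimal Python sketch. It is not the authors' implementation: the `Spec` class, its field names, and the `prune` helper are hypothetical names chosen for illustration. It only shows how a resource-level decision removes the topology and attribute options that depend on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """One candidate configuration, split along the three axes (hypothetical model)."""
    resources: frozenset   # what components exist, e.g. {"vm", "load_balancer"}
    topology: frozenset    # how they connect, as (source, target) pairs
    attributes: tuple      # settings, as ((resource, key), value) pairs


def prune(spec: Spec, keep: set) -> Spec:
    """Drop resources not in `keep`, plus every edge and attribute that refers to them."""
    resources = frozenset(r for r in spec.resources if r in keep)
    topology = frozenset(
        (a, b) for a, b in spec.topology if a in resources and b in resources
    )
    attributes = tuple(
        ((r, k), v) for (r, k), v in spec.attributes if r in resources
    )
    return Spec(resources, topology, attributes)


# Illustrative data only; the resource names are made up for this example.
full = Spec(
    resources=frozenset({"vm", "load_balancer", "database"}),
    topology=frozenset({("load_balancer", "vm"), ("vm", "database")}),
    attributes=((("vm", "size"), "small"), (("database", "engine"), "postgres")),
)

# Deciding "no database" at the resource level also settles the
# vm -> database edge and the database engine attribute.
no_db = prune(full, {"vm", "load_balancer"})
```

Once the user rules out the database, no topology or attribute question about it needs to be asked, which is the sense in which higher-level choices constrain lower-level ones.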
Ambig-IaC's training-free, disagreement-driven approach first uses a large language model (LLM) to generate multiple candidate IaC specifications from a single ambiguous prompt. It then analyzes these candidates to identify structural disagreements across the three axes, ranks these points of conflict by their informativeness, and formulates targeted clarification questions for the user. This interactive process progressively narrows the configuration space toward the intended setup. To validate the method, the team created a benchmark of 300 validated IaC tasks with intentionally ambiguous prompts, along with an evaluation framework based on graph edit distance and embedding similarity. Ambig-IaC outperformed the strongest baseline with relative improvements of +18.4% on structural accuracy and +25.4% on attribute accuracy, a step toward more reliable cloud infrastructure synthesis from natural language.
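The disagreement step can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the paper's algorithm: the candidate format, the `rank_disagreements` function, and the use of Shannon entropy as the informativeness score are all assumptions made here, and the candidates are hard-coded rather than produced by an LLM.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (bits) of how the candidates split on one decision."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def rank_disagreements(candidates):
    """Return (axis, item, entropy) for every decision the candidates disagree on.

    Each candidate is a dict with "resources" (set), "topology" (set of
    (source, target) pairs) and "attributes" (dict of (resource, key) -> value).
    Results are ordered resources, then topology, then attributes, so that
    higher-level questions come first; within an axis, higher entropy first.
    """
    ranked = []
    for level, axis in enumerate(("resources", "topology")):
        items = set().union(*(c[axis] for c in candidates))
        for item in items:
            h = entropy([item in c[axis] for c in candidates])
            if h > 0:  # unanimous decisions need no question
                ranked.append((level, -h, axis, item))
    keys = set().union(*(c["attributes"].keys() for c in candidates))
    for key in keys:
        h = entropy([c["attributes"].get(key) for c in candidates])
        if h > 0:
            ranked.append((2, -h, "attributes", key))
    # repr() gives a deterministic tie-break between items of equal entropy.
    ranked.sort(key=lambda t: (t[0], t[1], repr(t[3])))
    return [(axis, item, -neg_h) for _, neg_h, axis, item in ranked]


# Hypothetical candidates for the prompt "set up a web server".
candidates = [
    {"resources": {"vm", "load_balancer"},
     "topology": {("load_balancer", "vm")},
     "attributes": {("vm", "size"): "small"}},
    {"resources": {"vm"},
     "topology": set(),
     "attributes": {("vm", "size"): "small"}},
    {"resources": {"vm", "database"},
     "topology": {("vm", "database")},
     "attributes": {("vm", "size"): "large"}},
]

ranked = rank_disagreements(candidates)
for axis, item, score in ranked:
    print(f"{axis}: {item} (entropy {score:.3f})")
```

Every candidate includes a `vm`, so no question is raised about it. The candidates split on the load balancer and the database, so those resource-level questions come first, ahead of the topology and attribute questions that depend on them.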
- Framework identifies ambiguity across three hierarchical axes: resources, topology, and attributes, where high-level decisions constrain lower-level ones.
- Uses a training-free, disagreement-driven method to generate diverse candidate specs and ask targeted questions, with relative improvements over the strongest baseline of 18.4% in structural accuracy and 25.4% in attribute accuracy.
- Introduces a new benchmark of 300 validated IaC tasks to evaluate performance on ambiguous cloud deployment prompts.
Why It Matters
Enables more reliable generation of complex cloud infrastructure from vague natural language requests by resolving ambiguity before deployment, reducing costly deployment errors.