Research & Papers

CADSmith: Multi-Agent CAD Generation with Programmatic Geometric Validation

The system uses two correction loops and programmatic validation to cut geometric error, measured as mean Chamfer Distance, by 97%.

Deep Dive

A research team from Carnegie Mellon University has introduced CADSmith, a multi-agent system that turns natural language descriptions into executable CAD models. Where earlier single-pass methods often produce geometrically flawed output, CADSmith refines its result in two loops. The inner loop fixes code execution errors, while the outer loop performs programmatic geometric validation. That validation combines exact numerical measurements from the OpenCASCADE kernel, such as bounding box dimensions and volume, with a holistic visual assessment from an independent vision-language model called Judge. Together, the two signals supply both the millimeter-level precision and the high-level shape awareness that accurate CAD generation requires.
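The two loops described above can be sketched as plain control flow. This is a minimal illustration rather than the authors' implementation: the function `refine`, the callables `generate`, `execute`, and `validate`, the retry limits, and the toy stubs are all hypothetical names chosen for this example, and the real system's agents, prompts, and stopping criteria are not specified here.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Result:
    code: str
    executed: bool      # did the final code run without error?
    validated: bool     # did the geometry pass validation?
    outer_rounds: int   # number of outer (validation) rounds used


def refine(
    prompt: str,
    generate: Callable[[str, str], str],           # (prompt, feedback) -> CAD code
    execute: Callable[[str], Tuple[bool, str]],    # code -> (ok, error message)
    validate: Callable[[str], Tuple[bool, str]],   # code -> (ok, geometry report)
    max_inner: int = 3,                            # assumed retry limits
    max_outer: int = 3,
) -> Result:
    feedback = ""
    code = ""
    for outer in range(1, max_outer + 1):
        # Inner loop: regenerate until the code executes.
        executed = False
        for _ in range(max_inner):
            code = generate(prompt, feedback)
            executed, error = execute(code)
            if executed:
                break
            feedback = f"execution error: {error}"
        if not executed:
            return Result(code, False, False, outer)

        # Outer loop: check the resulting geometry and feed the report back.
        ok, report = validate(code)
        if ok:
            return Result(code, True, True, outer)
        feedback = f"geometry mismatch: {report}"
    return Result(code, True, False, max_outer)


# Toy stand-ins for the generator, the CAD kernel, and the validator.
def fake_generate(prompt: str, feedback: str) -> str:
    if feedback.startswith("geometry mismatch"):
        return "box(10, 10, 20)"
    if feedback.startswith("execution error"):
        return "box(10, 10, 5)"
    return "box(10, 10"          # first attempt has a syntax error


def fake_execute(code: str) -> Tuple[bool, str]:
    return (True, "") if code.endswith(")") else (False, "unclosed parenthesis")


def fake_validate(code: str) -> Tuple[bool, str]:
    return (True, "") if "20" in code else (False, "height should be 20 mm")


result = refine("a 10 x 10 x 20 mm box", fake_generate, fake_execute, fake_validate)
print(result)
```

In this toy run the first attempt fails to execute, the second executes but has the wrong height, and the third passes validation in the second outer round. In the system described above, the validation step would draw on kernel measurements and the vision-language model's assessment instead of a string check.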

CADSmith uses retrieval-augmented generation (RAG) over API documentation instead of fine-tuning, which lets it stay current as the underlying CAD libraries evolve. The team evaluated the system on a custom benchmark of 100 prompts across three difficulty tiers. CADSmith achieved a 100% execution rate (up from 95% for a baseline), raised the median Intersection over Union (IoU) score from 0.8085 to 0.9629, and lowered the mean Chamfer Distance, a key metric for geometric accuracy, from 28.37 to 0.74. That is a 97% reduction in geometric error, which suggests that closed-loop refinement with programmatic feedback substantially improves reliability.
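For readers unfamiliar with the metrics, the sketch below shows one common way to compute them and reproduces the 97% figure from the reported numbers. It rests on assumptions: Chamfer Distance is taken as the symmetric mean nearest-neighbour distance between two point sets (definitions vary, and some use squared distances), and IoU is shown for axis-aligned boxes as a simplified stand-in for the volumetric IoU of solids. The function names and toy inputs are illustrative only and are not taken from the paper.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]
Box = Tuple[Point, Point]  # (min corner, max corner), axis-aligned


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer distance: mean nearest-neighbour distance a->b plus b->a."""
    def one_way(src: Sequence[Point], dst: Sequence[Point]) -> float:
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)
    return one_way(a, b) + one_way(b, a)


def box_volume(box: Box) -> float:
    lo, hi = box
    return math.prod(max(0.0, h - l) for l, h in zip(lo, hi))


def box_iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    lo = tuple(max(x, y) for x, y in zip(a[0], b[0]))
    hi = tuple(min(x, y) for x, y in zip(a[1], b[1]))
    inter = box_volume((lo, hi))
    union = box_volume(a) + box_volume(b) - inter
    return inter / union if union > 0 else 0.0


def relative_reduction(before: float, after: float) -> float:
    return (before - after) / before


# Toy point sets: the second point is displaced by 1 unit along z.
pts_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pts_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
cd = chamfer_distance(pts_a, pts_b)

# Two 2 x 2 x 2 boxes offset by 1 unit along x.
iou = box_iou(((0, 0, 0), (2, 2, 2)), ((1, 0, 0), (3, 2, 2)))

# Reported mean Chamfer Distance: 28.37 (baseline) vs 0.74 (CADSmith).
reduction = relative_reduction(28.37, 0.74)

print(f"chamfer={cd:.3f} iou={iou:.3f} reduction={reduction:.1%}")
```

Lower Chamfer Distance and higher IoU both indicate a closer match to the reference shape. The last value shows that the drop from 28.37 to 0.74 corresponds to a relative reduction of roughly 97%.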

The work, detailed in an arXiv preprint, addresses a core weakness in text-to-CAD generation: the lack of verifiable geometric grounding. By integrating CAD kernel validation directly into the correction cycle, CADSmith checks that the final model is not just syntactically correct code but a dimensionally accurate solid. The pipeline is a step toward trustworthy AI assistants for engineering and design, where precision is non-negotiable.

Key Points
  • Uses a two-loop correction system: an inner loop that fixes code execution errors and an outer loop that performs programmatic geometric validation with OpenCASCADE and a VLM.
  • Achieved a 100% execution rate and reduced mean Chamfer Distance error by 97% (from 28.37 to 0.74) on a 100-prompt benchmark.
  • Employs retrieval-augmented generation (RAG) over API docs instead of fine-tuning, ensuring the system stays updated with evolving CAD libraries.

Why It Matters

This moves AI-generated CAD closer to reliable use in mechanical engineering and manufacturing, where dimensional accuracy is critical for prototyping and parts.