Agent Frameworks

VLM-CAD: VLM-Optimized Collaborative Agent Design Workflow for Analog Circuit Sizing

New workflow combines VLMs with symbolic reasoning to overcome AI's spatial blindness in engineering.

Deep Dive

A research team led by Guanyuan Pan has introduced VLM-CAD (Vision Language Model-Optimized Collaborative Agent Design Workflow), an AI system that tackles the notoriously difficult task of analog circuit sizing. The work addresses a critical weakness of current Vision Language Models (VLMs) such as GPT-4V: their "spatial blindness" and tendency toward logical hallucinations when interpreting dense, structured engineering schematics. VLM-CAD bridges this gap with a multi-agent workflow in which specialized modules handle distinct reasoning steps, each anchored in deterministic facts rather than pure statistical inference.

At its core, VLM-CAD uses a neuro-symbolic parsing module called Image2Net, which transforms raw circuit-diagram pixels into explicit topological graphs and structured JSON representations, giving the VLM a factual, unambiguous foundation to reason over. To ensure reliability for high-stakes engineering decisions, the system employs ExTuRBO (Explainable Trust Region Bayesian Optimization), an optimizer that uses agent-generated "semantic seeds" to warm-start its searches and, through Automatic Relevance Determination, attaches quantified evidence to every AI-generated design choice.
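The paper does not publish Image2Net's exact schema, but the idea of grounding a VLM in an explicit topology can be sketched as follows. The current-mirror example, the field names, and the `to_topology` helper are illustrative assumptions, not the authors' implementation:

```python
import json

# Assumed example: devices extracted from a schematic of a simple
# NMOS current mirror (hypothetical values, illustrative only).
devices = [
    {"id": "M1", "type": "nmos", "pins": {"g": "n_bias", "d": "n_bias", "s": "gnd"}},
    {"id": "M2", "type": "nmos", "pins": {"g": "n_bias", "d": "n_out", "s": "gnd"}},
]

def to_topology(devices):
    """Build a net -> [(device, pin), ...] adjacency map from the device list."""
    nets = {}
    for dev in devices:
        for pin, net in dev["pins"].items():
            nets.setdefault(net, []).append((dev["id"], pin))
    return nets

topology = to_topology(devices)

# A structured JSON document like this, rather than raw pixels, is what
# the downstream VLM agents would reason over.
print(json.dumps({"devices": devices, "nets": topology}, indent=2))
```

Because connectivity is stated explicitly (e.g. which pins share net `n_bias`), the VLM no longer has to infer wiring from pixel geometry, which is where spatial hallucinations arise.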

The experimental results, submitted to ACM Multimedia 2026, demonstrate significant gains. On two complex circuit benchmarks, VLM-CAD substantially improved spatial reasoning accuracy while preserving physics-based explainability, a prerequisite for engineering trust. Crucially, the AI-driven workflow consistently met complex performance specifications while optimizing for low power consumption, completing the entire design process in under 66 minutes. This marks a major step toward deploying robust, explainable multimodal AI in specialized technical domains where precision is non-negotiable.

Key Points
  • Integrates Image2Net, a neuro-symbolic parser that converts circuit schematics into topological graphs and JSON to ground VLM reasoning.
  • Uses ExTuRBO, an explainable Bayesian optimizer that provides quantified evidence for AI decisions via Automatic Relevance Determination.
  • Achieved high accuracy on complex benchmarks, satisfying specs with low power in under 66 minutes total runtime.
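To make the warm-start idea above concrete, here is a minimal sketch of a trust-region search seeded with agent-proposed candidates. This is not ExTuRBO itself: the toy objective, the seed values, and the shrink schedule are assumptions for illustration, and the Bayesian surrogate and ARD evidence are omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

def objective(x):
    # Toy stand-in for a circuit cost (e.g. power plus spec penalties).
    return (x[0] - 0.3) ** 2 + 10 * (x[1] - 0.7) ** 2

# "Semantic seeds": sizings an agent might propose from the parsed
# topology (hypothetical values), used to warm-start the search.
seeds = np.array([[0.25, 0.65], [0.4, 0.8]])

def trust_region_search(objective, seeds, iters=50, radius=0.2):
    """Sample around the incumbent; shrink the trust region on failure."""
    best = min(seeds, key=objective)
    best_val = objective(best)
    for _ in range(iters):
        cand = np.clip(best + rng.uniform(-radius, radius, size=best.shape), 0, 1)
        val = objective(cand)
        if val < best_val:
            best, best_val = cand, val  # success: move the incumbent
        else:
            radius *= 0.95              # failure: shrink the trust region
    return best, best_val

best, best_val = trust_region_search(objective, seeds)
```

Starting from informed seeds rather than random points is what lets this style of search converge quickly, which is consistent with the short end-to-end runtimes the paper reports.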

Why It Matters

Automates a complex, weeks-long engineering task in about an hour with explainable AI, potentially accelerating chip and hardware design.