OmniSch: A Multimodal PCB Schematic Benchmark For Structured Diagram Visual Reasoning
New benchmark reveals major gaps in how models like GPT-4o and Claude 3.5 interpret complex engineering diagrams.
A team of fourteen researchers led by Taiting Lu has published OmniSch, a benchmark designed to rigorously test the capabilities of Large Multimodal Models (LMMs) such as GPT-4o and Claude 3.5 in a specialized domain: interpreting Printed Circuit Board (PCB) schematic diagrams. It is the first comprehensive benchmark for this task, containing 1,854 real-world schematic images. It moves beyond simple image captioning to assess a model's ability to perform a core engineering function: converting a visual diagram into a machine-readable, spatially weighted netlist graph, the backbone of Electronic Design Automation (EDA) tools.
The benchmark comprises four distinct tasks of escalating complexity:

- Visual Grounding: aligning 423.4K semantic labels to visual regions.
- Diagram-to-Graph reasoning: recovering topological connections.
- Geometric reasoning: assigning layout-dependent weights.
- Tool-Augmented Agentic reasoning: using external tools to solve the preceding tasks.

The team's evaluation revealed substantial gaps in current state-of-the-art models: unreliable fine-grained grounding of components, brittle parsing of layout into graph structure, inconsistent reasoning about global connectivity, and inefficient visual exploration. In short, while LMMs excel at general vision tasks, they struggle with the precise, structured reasoning that technical engineering artifacts demand.
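To make the target representation concrete, the spatially weighted netlist graph can be sketched minimally as follows. This is an illustrative assumption, not the benchmark's actual schema: the `Component` fields, the placement coordinates, and the Euclidean-distance weighting are all hypothetical stand-ins for whatever layout-dependent weights OmniSch actually uses.

```python
# Hypothetical sketch of a spatially weighted netlist graph.
# Components are nodes; nets are edges weighted here by the
# Euclidean distance between component placements in the diagram.
from dataclasses import dataclass
from math import hypot

@dataclass(frozen=True)
class Component:
    ref: str    # reference designator, e.g. "R1"
    x: float    # placement coordinates read off the diagram
    y: float

def build_netlist(components, nets):
    """nets: list of (ref_a, ref_b) connections parsed from the diagram.
    Returns an adjacency dict with distance-based edge weights."""
    by_ref = {c.ref: c for c in components}
    graph = {}
    for a, b in nets:
        ca, cb = by_ref[a], by_ref[b]
        weight = hypot(ca.x - cb.x, ca.y - cb.y)
        graph.setdefault(a, []).append((b, weight))
        graph.setdefault(b, []).append((a, weight))
    return graph

parts = [Component("R1", 0.0, 0.0), Component("C1", 3.0, 4.0)]
netlist = build_netlist(parts, [("R1", "C1")])
# netlist["R1"] == [("C1", 5.0)]
```

The point of the structure is that correctness requires both topology (which pins connect) and geometry (where the parts sit), which is exactly the split between the Diagram-to-Graph and Geometric reasoning tasks.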
This work is significant because it provides a concrete, quantitative measure of AI's readiness for real-world engineering assistance. It shifts the conversation from whether models can 'see' a schematic to whether they can correctly 'understand' its function and connectivity with the accuracy demanded by hardware designers. The benchmark's release will likely drive focused improvements in multimodal reasoning, pushing developers to create models that are not just broadly capable but technically reliable for specialized professional use.
- Contains 1,854 real-world PCB schematic diagrams and 109.9K grounded component instances for testing.
- Evaluates models on four core engineering tasks, including spatial netlist graph construction critical for EDA workflows.
- Reveals major reliability gaps in current LMMs, including brittle layout parsing and inconsistent connectivity reasoning.
Why It Matters
It provides the first rigorous test of AI's ability to assist in hardware design, a multi-billion-dollar industry reliant on precise diagram interpretation.