ChipMATE: Multi-agent RL framework achieves 80% RTL generation accuracy with 9B parameters
Outperforms DeepSeek V4 (1600B) with models roughly 200x smaller and without a golden testbench.
ChipMATE tackles a mismatch between existing AI-driven RTL generation and industrial practice: current API-based agents assume a golden testbench is available at generation time, rely on closed-source APIs that are incompatible with chip vendors' air-gapped security requirements, and cannot be trained on proprietary RTL codebases. To address this, the team introduces a multi-agent system in which a Verilog agent and a Python reference-model agent cross-verify each other's outputs without any oracle. The framework uses a backtrack-based inference workflow to prevent errors from propagating across turns, and a two-stage training pipeline that first uses reinforcement learning to saturate each agent's individual code-generation capability, then jointly trains the two agents to collaborate effectively.
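The cross-verification and backtracking ideas can be illustrated with a minimal Python sketch. It is not ChipMATE's implementation: the names `mismatch_count` and `refine_with_backtrack` are hypothetical, both the "Verilog" candidate and the reference model are stood in for by plain Python callables (a real system would simulate generated Verilog), and the reference model is treated as fixed, whereas in the framework described above either agent may be the one at fault.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in: a design is a callable from two 8-bit inputs to an output.
Design = Callable[[int, int], int]
Vector = Tuple[int, int]


def mismatch_count(dut: Design, ref: Design, vectors: Sequence[Vector]) -> int:
    """Count input vectors on which the candidate and the reference model disagree."""
    return sum(dut(a, b) != ref(a, b) for a, b in vectors)


def refine_with_backtrack(
    initial: Design,
    revisions: Sequence[Design],
    ref: Design,
    vectors: Sequence[Vector],
) -> Tuple[Design, List[int]]:
    """Try one revision per turn; keep it only if disagreement does not grow.

    A rejected revision is discarded and the next turn restarts from the last
    accepted candidate, so a bad turn cannot contaminate later ones.
    """
    best = initial
    best_score = mismatch_count(best, ref, vectors)
    history = [best_score]
    for candidate in revisions:
        score = mismatch_count(candidate, ref, vectors)
        if score <= best_score:
            best, best_score = candidate, score  # accept the revision
        # otherwise: backtrack, i.e. keep the previous best unchanged
        history.append(best_score)
        if best_score == 0:
            break  # both sides agree on every vector
    return best, history


# Toy example (assumed spec): an 8-bit adder whose result wraps at 256.
reference = lambda a, b: (a + b) & 0xFF   # Python reference model
buggy = lambda a, b: a + b                # first attempt: forgets to wrap
worse = lambda a, b: a - b                # bad revision, should be rejected
fixed = lambda a, b: (a + b) % 256        # good revision

vectors = [(a, b) for a in range(0, 256, 17) for b in range(0, 256, 17)]
best, history = refine_with_backtrack(buggy, [worse, fixed], reference, vectors)
print(history)
```

In this toy run the mismatch count stays flat on the turn where the bad revision is rejected and drops to zero once the correct one is accepted. No golden testbench appears anywhere: agreement between two independently written artifacts is the only signal.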
To support training, the authors built a hybrid data-generation pipeline that produces 64.4K high-quality reference-model samples. ChipMATE with a 4B base model achieves 75.0% pass@1 on VerilogEval V2, while the 9B version reaches 80.1%, surpassing all existing self-trained models and even DeepSeek V4 (1600B) in a head-to-head comparison. The framework is entirely open-source, with code and model weights publicly available, so semiconductor companies can deploy and fine-tune RTL generation models on their own secure infrastructure and, for the first time, train them on proprietary design data.
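For readers unfamiliar with the metric, pass@1 is the k = 1 case of pass@k, commonly computed with the unbiased estimator 1 - C(n-c, k) / C(n, k), where n samples are drawn per problem and c of them pass. The sketch below assumes that estimator; the summary does not say how many samples per problem the authors drew, and the per-problem counts here are invented for illustration rather than taken from the paper.

```python
from math import comb
from typing import Sequence, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples drawn, c of them correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        return 1.0  # every size-k subset contains at least one correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: Sequence[Tuple[int, int]], k: int) -> float:
    """Average per-problem pass@k over a benchmark of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Illustrative (made-up) results for four problems, 10 samples each.
toy_results = [(10, 10), (10, 5), (10, 0), (10, 8)]
score = benchmark_pass_at_k(toy_results, k=1)
print(f"pass@1 = {score:.3f}")
```

For k = 1 the estimator reduces to c / n per problem, so the benchmark score is the average fraction of correct samples.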
- ChipMATE pairs a Verilog agent with a Python reference-model agent for mutual verification without a golden testbench, mimicking industrial cross-comparison practices.
- Achieves 75.0% (4B) and 80.1% (9B) pass@1 on VerilogEval V2, outperforming DeepSeek V4 (1600B) with models roughly 200x smaller.
- Self-trained and fully open-source, enabling air-gapped deployment and fine-tuning on proprietary RTL codebases.
Why It Matters
ChipMATE democratizes AI-assisted chip design for security-conscious semiconductor vendors by enabling self-hosted, oracle-free RTL generation at state-of-the-art accuracy.