Research & Papers

CAX-Agent achieves 92.7% completion rate in MAPDL simulations

New agent harness with model-only recovery reaches an 84% zero-intervention rate, against zero for rule-only recovery.

Deep Dive

Without structured execution control, large language models deployed for MAPDL finite-element simulations often produce inconsistent outputs and fail tasks frequently. CAX-Agent, developed by Chenying Lin and colleagues, addresses this with a lightweight agent harness that acts as domain-specific orchestration middleware. Its architecture organizes execution into three layers: an LLM service, an agent harness that manages tool lifecycles and workflow state, and a solver backend. A key innovation is its recovery ladder, which escalates a failed run from deterministic rule patching, through model-driven regeneration, to context enrichment and finally human intervention.
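
The escalation logic of a recovery ladder can be sketched in a few lines of Python. This is a minimal illustration, not CAX-Agent's implementation: the names `Attempt`, `run_with_ladder`, `toy_solve`, `rule_patch`, and `model_regenerate` are hypothetical, the "solver" is a toy string check rather than MAPDL, and the model rung is a stub standing in for an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    script: str
    error: Optional[str] = None  # None means the run succeeded


# A rung receives the failed attempt and returns a revised script,
# or None when it has nothing to offer for that failure.
Rung = Callable[[Attempt], Optional[str]]


def run_with_ladder(
    script: str,
    solve: Callable[[str], Attempt],
    ladder: List[Tuple[str, Rung]],
) -> Tuple[Attempt, str]:
    """Run the script; on failure, try each rung in order.

    Returns the last attempt and the name of the rung that fixed it
    ("none" if no recovery was needed, "human" if every rung failed).
    """
    attempt = solve(script)
    if attempt.error is None:
        return attempt, "none"
    for name, rung in ladder:
        revised = rung(attempt)
        if revised is None:
            continue  # this rung cannot handle the failure; escalate
        attempt = solve(revised)
        if attempt.error is None:
            return attempt, name
    return attempt, "human"


def toy_solve(script: str) -> Attempt:
    """Toy stand-in for a solver: requires two hypothetical steps."""
    for step in ("material", "solve"):
        if step not in script.split():
            return Attempt(script, f"missing {step}")
    return Attempt(script)


def rule_patch(attempt: Attempt) -> Optional[str]:
    """Deterministic rule: only knows how to append a missing solve step."""
    if attempt.error == "missing solve":
        return attempt.script + " solve"
    return None


def model_regenerate(attempt: Attempt) -> Optional[str]:
    """Stub for model-driven regeneration (a real harness would call an LLM)."""
    return "mesh material solve"


LADDER = [("rule_patch", rule_patch), ("model_regeneration", model_regenerate)]

print(run_with_ladder("mesh material", toy_solve, LADDER)[1])
print(run_with_ladder("mesh", toy_solve, LADDER)[1])
```

In this toy setup, the first call is repaired by the rule rung and the second escalates to the model rung. Dropping the model rung from `LADDER` makes the second call fall through to "human", which is the kind of case a zero-intervention rate counts against.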

The evaluation covered 50 standard structural benchmarks and 450 case-runs (three runs per benchmark under each of three recovery strategies). Model-only recovery performed best, with a task completion rate of 0.9267, a task score of 3.59 out of 4, a total score of 9.16 out of 10, and a zero-intervention rate of 0.84. Rule-only recovery reached 0.7733 completion with a zero-intervention rate of 0, and no-recovery reached 0.6933 completion. Inter-rater agreement was strong (Cohen's kappa = 0.84), and effect sizes were large (Cliff's delta 0.81–0.87). The results indicate that model-only recovery lets most routine simulations finish without human intervention, although about one run in six still needed it.
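
As a quick sanity check on these figures, the sketch below back-computes how many case-runs each reported rate implies. It assumes the 450 case-runs split evenly into 150 per strategy (50 benchmarks times three runs); the resulting counts are inferred from the rounded rates, not reported in the source.

```python
BENCHMARKS = 50
RUNS_PER_BENCHMARK = 3
STRATEGIES = ("model-only", "rule-only", "no-recovery")

# Assumed even split: 150 case-runs per strategy, 450 in total.
runs_per_strategy = BENCHMARKS * RUNS_PER_BENCHMARK
total_runs = runs_per_strategy * len(STRATEGIES)

# Completion rates as reported in the summary.
reported_completion = {
    "model-only": 0.9267,
    "rule-only": 0.7733,
    "no-recovery": 0.6933,
}

# Nearest whole number of completed runs implied by each rate.
implied = {
    name: round(rate * runs_per_strategy)
    for name, rate in reported_completion.items()
}

for name, count in implied.items():
    # Confirm the count reproduces the reported rate to four decimals.
    matches = round(count / runs_per_strategy, 4) == reported_completion[name]
    print(f"{name}: {count}/{runs_per_strategy} completed (consistent: {matches})")

# Zero-intervention rate of 0.84 for model-only, under the same assumption.
zero_intervention_runs = round(0.84 * runs_per_strategy)
print(f"model-only zero-intervention: {zero_intervention_runs}/{runs_per_strategy}")
```

Under that assumption, every reported rate corresponds to a whole number of runs, which suggests the figures are internally consistent.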

Key Points
  • Model-only recovery achieved 92.7% task completion vs 77.3% for rule-only and 69.3% for no-recovery.
  • Zero-intervention rate of 84% with model-only recovery, meaning 84% of its case-runs finished without human help.
  • Strong inter-rater agreement (Cohen's kappa = 0.84) and large effect sizes (Cliff's delta 0.81–0.87).
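
For readers unfamiliar with Cliff's delta, the effect size cited above, here is a minimal sketch of its standard definition: the share of cross-group pairs where the first group scores higher, minus the share where it scores lower. The function name and the sample scores are illustrative only and are not data from the paper.

```python
from typing import Sequence


def cliffs_delta(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Cliff's delta: P(x > y) - P(x < y) over all cross-group pairs.

    Ranges from -1 (every x below every y) to +1 (every x above every y).
    """
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Made-up scores for two hypothetical strategies.
strategy_a = [3, 4, 4]
strategy_b = [1, 2, 4]
print(cliffs_delta(strategy_a, strategy_b))
```

A value in the reported 0.81–0.87 range means that in the large majority of pairwise comparisons one strategy's score exceeded the other's.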

Why It Matters

Makes AI-driven engineering simulation more dependable: with model-only recovery, 84% of case-runs finished with no human intervention.