Robotics

RL agent designs optimal experiments for mechatronic systems with 0.75% safety violations

An RL agent autonomously crafts excitation signals, matching or exceeding expert-designed methods on a Quanser Aero 2.

Deep Dive

Classical system identification for mechatronic systems relies on expert knowledge to manually design excitation signals that respect hardware safety limits, a process that is time-consuming and does not generalize across systems. In this paper, Julian Langschwert and colleagues introduce a reinforcement learning (RL) agent that learns to generate optimal excitation signals autonomously. The agent is trained on a Quanser Aero 2 testbed and uses reward shaping to enforce safety constraints during signal generation, eliminating the need for human trial and error.

Evaluated over 10 independent training seeds, the RL approach consistently matches or exceeds the accuracy of classical baselines on all three identified parameters. Crucially, safety violations occur only 0.75% of the time, a rate that conventional methods struggle to achieve without overly conservative designs. The work, accepted at DEXA AI4IP 2026, demonstrates that RL can replace manual experiment design for parameter identification, paving the way for fully autonomous system calibration in robotics and mechatronics.
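
To make the reward-shaping idea concrete, here is a minimal Python sketch. It is not the authors' implementation. The limit values, the penalty weight, the info_gain term (a stand-in for whatever measure of signal informativeness the agent optimizes), and the definition of the violation rate as the share of time steps outside the limits are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SafetyLimits:
    # Hypothetical limits for illustration only; not taken from the paper
    # or from the Quanser Aero 2 specification.
    max_abs_angle: float = 1.0     # rad
    max_abs_voltage: float = 18.0  # V


def shaped_reward(info_gain, angle, voltage, limits, penalty_weight=10.0):
    """Informativeness of the step minus a penalty for exceeding limits.

    info_gain is an assumed placeholder for how useful the current
    excitation is for parameter estimation.
    """
    angle_excess = max(0.0, abs(angle) - limits.max_abs_angle)
    voltage_excess = max(0.0, abs(voltage) - limits.max_abs_voltage)
    return info_gain - penalty_weight * (angle_excess + voltage_excess)


def violation_rate(angles, voltages, limits):
    """Fraction of time steps in which any limit is exceeded (assumed definition)."""
    steps = list(zip(angles, voltages))
    if not steps:
        return 0.0
    violations = sum(
        1
        for angle, voltage in steps
        if abs(angle) > limits.max_abs_angle
        or abs(voltage) > limits.max_abs_voltage
    )
    return violations / len(steps)
```

Under this sketch, a step inside the limits keeps its full informativeness reward, while a step outside them is penalized in proportion to how far it overshoots. The penalty weight sets how strongly the agent trades informativeness against safety.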

Key Points
  • RL agent autonomously designs excitation signals, replacing manual expert design; safety limits are enforced through reward shaping.
  • Achieves competitive parameter estimation accuracy across 3 identified parameters on Quanser Aero 2.
  • Safety violations limited to just 0.75%, with accuracy that matches or exceeds classical baselines across 10 training seeds.

Why It Matters

Automating experiment design saves time and improves safety for calibrating robots and mechatronic systems.