Robotics

Stress tests reveal SPARK humanoid safety filters fail under real-world chaos

arXiv cs.RO May 20, 2026

⚡ Researchers expose critical flaws in humanoid robot safety filters during adversarial stress testing

Deep Dive

A team of researchers led by Saurav Ghosh replicated the SPARK benchmark case G1SportMode_D1_WG_SO_v1 in the MuJoCo physics simulator to evaluate the robustness of six safety filters designed for humanoid robots: RSSA, RSSS, SSA, CBF, PFM, and SMA. These filters are meant to modify control actions that might violate collision-avoidance constraints, but nominal benchmark scores often hide weaknesses in harder environments. The team built a post-processing pipeline to convert raw SPARK logs into goal-tracking, minimum-distance, and collision-step metrics, providing a more detailed view of each filter's performance under controlled random seeds.
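To make the three metrics concrete, here is a minimal sketch of how per-step logs could be reduced to goal-tracking, minimum-distance, and collision-step numbers. It is not the team's pipeline: the function name `summarize_episode`, the input layout (per-step robot positions, goal positions, and closest obstacle distances), and the `collision_threshold` convention are all assumptions made for illustration, and the real SPARK log format may differ.

```python
import math


def summarize_episode(robot_pos, goal_pos, min_obstacle_dist, collision_threshold=0.0):
    """Reduce one episode of per-step logs to three summary metrics.

    All inputs are hypothetical stand-ins for fields in a raw log:
      robot_pos:         list of per-step robot positions, e.g. (x, y)
      goal_pos:          list of per-step goal positions, same shape
      min_obstacle_dist: list of per-step closest robot-to-obstacle distances
    A step counts as a collision when its distance is at or below
    collision_threshold (an assumed convention, not taken from SPARK).
    """
    # Goal tracking: Euclidean distance between robot and goal at each step.
    goal_error = [math.dist(p, g) for p, g in zip(robot_pos, goal_pos)]
    return {
        "mean_goal_error": sum(goal_error) / len(goal_error),
        "min_distance": min(min_obstacle_dist),
        "collision_steps": sum(1 for d in min_obstacle_dist if d <= collision_threshold),
    }


# Example: three steps toward a fixed goal, with one step in contact.
metrics = summarize_episode(
    robot_pos=[(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    goal_pos=[(2.0, 0.0)] * 3,
    min_obstacle_dist=[0.5, -0.1, 0.2],
)
```

Reporting the three numbers side by side matters because a filter can score well on one while doing poorly on another, for example by staying far from obstacles but never reaching the goal.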

Stress tests introduced obstacle crowding, noisy distance estimates, and delayed obstacle information, all conditions typical of real-world humanoid deployment. No single filter excelled everywhere: some tracked the goal more closely, while others reduced collision steps more effectively. Critically, safety behavior changed under stress, revealing failure modes that nominal benchmarks miss. The findings underscore that humanoid autonomy must be evaluated beyond standard benchmarks, using metrics that expose weaknesses before deployment. The work also provides a replicable methodology for stress-testing safety filters in high-dimensional, collision-prone environments.
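Two of the three perturbations, noisy distance estimates and delayed obstacle information, can be illustrated with a small wrapper around the distance signal a safety filter consumes. This is a sketch under stated assumptions rather than the study's implementation: the class `StressedDistanceSensor`, its parameters `noise_std` and `delay_steps`, and the choice of Gaussian noise are all hypothetical. Obstacle crowding is a change to the scene itself and is not covered here.

```python
import random
from collections import deque


class StressedDistanceSensor:
    """Hypothetical wrapper that corrupts true obstacle distances.

    noise_std:   standard deviation of additive Gaussian noise (assumed model)
    delay_steps: number of control steps by which readings lag behind
    seed:        fixes the noise sequence so runs are repeatable
    """

    def __init__(self, noise_std=0.0, delay_steps=0, seed=0):
        self.noise_std = noise_std
        self.rng = random.Random(seed)
        # Keep the last delay_steps + 1 readings; the oldest one is reported.
        self.buffer = deque(maxlen=delay_steps + 1)

    def observe(self, true_distance):
        """Return the distance the filter would see at this step."""
        noisy = true_distance + self.rng.gauss(0.0, self.noise_std)
        self.buffer.append(noisy)
        # Until the buffer fills, the earliest available reading is returned.
        return self.buffer[0]


# Example: a two-step delay with no noise while an obstacle approaches.
sensor = StressedDistanceSensor(noise_std=0.0, delay_steps=2)
seen = [sensor.observe(d) for d in [1.0, 0.8, 0.6, 0.4]]
```

In this example the filter still sees a distance of 0.8 when the obstacle is actually 0.4 away, which is the kind of stale information that can let a filter approve an unsafe action. Seeding the noise generator mirrors the article's point about controlled random seeds: the same perturbation sequence can be replayed against each filter.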

Key Points

Replicated SPARK benchmark case G1SportMode_D1_WG_SO_v1 in MuJoCo for controlled adversarial stress testing
Tested six safety filters (RSSA, RSSS, SSA, CBF, PFM, SMA) under obstacle crowding, noisy distance, and delayed obstacle info
Safety behavior changed significantly under stress, with no single filter performing reliably across all conditions

Why It Matters

Humanoid robots need stress testing before real-world deployment to avoid dangerous safety filter failures.
