Assistax benchmark accelerates assistive robotics RL with up to 370x speedup
JAX-powered benchmark speeds up training of robot agents that assist humans
A team of researchers from the University of Edinburgh and other institutions has released Assistax, a new benchmark designed to accelerate reinforcement learning (RL) for assistive robotics. Unlike traditional RL benchmarks that rely on games like Atari or Go, Assistax focuses on embodied interaction scenarios in which a robot must assist a simulated human patient. The key innovation is running the physics-based simulations on hardware accelerators through JAX: when training runs are vectorized, Assistax is up to 370x faster in wall-clock time than CPU-based alternatives. This lets researchers iterate on robot policies far more quickly than CPU-bound simulators allow.
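To make the vectorization idea concrete, here is a minimal JAX sketch. It is not Assistax code: the 1-D point mass, the `step` and `rollout` functions, and the batch sizes are all hypothetical stand-ins for a real physics simulator. It only illustrates the general pattern of writing a single-environment rollout and letting `jax.vmap` and `jax.jit` batch and compile it for an accelerator.

```python
import jax
import jax.numpy as jnp

def step(state, action, dt=0.05):
    """One Euler step of a toy 1-D point mass (a stand-in, not Assistax physics)."""
    pos, vel = state
    vel = vel + dt * action
    pos = pos + dt * vel
    return (pos, vel)

def rollout(init_pos, actions):
    """Roll out one environment over all actions and return the final position."""
    def body(state, action):
        return step(state, action), None
    init_state = (init_pos, jnp.zeros_like(init_pos))
    (pos, _vel), _ = jax.lax.scan(body, init_state, actions)
    return pos

# vmap maps the single-environment rollout over a batch of environments;
# jit compiles the batched function once for the available device.
batched_rollout = jax.jit(jax.vmap(rollout))

num_envs, horizon = 1024, 100                 # illustrative sizes only
init_positions = jnp.zeros(num_envs)
actions = jnp.ones((num_envs, horizon))       # constant unit force in every env
final_positions = batched_rollout(init_positions, actions)
```

Because the whole rollout is one compiled function, all 1,024 toy environments advance in lockstep on the device instead of being stepped one at a time in Python. How much faster that is in practice depends on the simulator and hardware.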
Assistax frames the assistive task as a multi-agent RL problem: it trains a population of diverse human partner agents, then tests a robotic agent's ability to coordinate with partners it has not seen during training (zero-shot coordination). The benchmark includes extensive hyperparameter tuning and baselines for popular continuous control algorithms. The accompanying paper was accepted at the Reinforcement Learning Conference 2026. By providing a standardized, open-source platform, Assistax aims to push RL research toward real-world applications such as robotic caregiving and rehabilitation, where adaptive human-robot collaboration is critical.
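The zero-shot coordination protocol can be sketched in a few lines of JAX. This is a toy illustration under strong assumptions, not the benchmark's evaluation code. Each policy is reduced to a single scalar, `episode_return` is a made-up stand-in for a full episode, and the 12/4 split of partners is arbitrary. The point is only the structure: train the robot against one set of partners, then score it against partners it never encountered.

```python
import jax
import jax.numpy as jnp

def episode_return(robot_param, partner_param):
    """Toy joint return: highest when the robot matches the partner's preference."""
    return -(robot_param - partner_param) ** 2

# Hypothetical partner population, split into training and held-out partners.
key = jax.random.PRNGKey(0)
partners = jax.random.normal(key, (16,))
train_partners, heldout_partners = partners[:12], partners[12:]

def mean_return(robot_param, partner_params):
    """Average return of one robot policy over a batch of partners."""
    returns = jax.vmap(episode_return, in_axes=(None, 0))(robot_param, partner_params)
    return returns.mean()

# "Train" the robot by gradient ascent on return against training partners only.
robot_param = jnp.array(0.0)
grad_fn = jax.jit(jax.grad(mean_return))
for _ in range(200):
    robot_param = robot_param + 0.1 * grad_fn(robot_param, train_partners)

# Zero-shot evaluation: score against partners the robot never trained with.
zsc_score = mean_return(robot_param, heldout_partners)
```

In this toy setting the robot converges to the average preference of its training partners, so the held-out score reflects how well that compromise transfers to new partners. A more diverse training population generally helps.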
- Uses JAX hardware acceleration to run physics simulations for RL training up to 370x faster than CPU-based alternatives
- Multi-agent setup trains diverse human partner agents, then tests the robot's zero-shot coordination with unseen partners in assistive tasks
- Accepted at Reinforcement Learning Conference 2026; code and baselines are open-source
Why It Matters
Assistax slashes training time for assistive robot policies, accelerating progress toward real-world human-robot interaction.