OrbiSim turns world models into differentiable physics engines for robotics
End-to-end differentiable simulation enables gradient-based policy training with sparse rewards.
OrbiSim, developed by researchers at Shanghai Jiao Tong University, introduces a new paradigm for robotic simulation by redefining world models as fully differentiable physics engines. Unlike traditional world models that focus on unconstrained imagination in latent or visual domains, OrbiSim establishes a unified, physically grounded pathway bridging structured scene assets, neural dynamics, and downstream reinforcement learning. The key innovation is end-to-end differentiability throughout the entire simulation loop, from explicit state transitions to visual observation generation. This allows the system to support tasks previously intractable with classical simulators, such as differentiable contact modeling, gradient-based policy optimization under sparse rewards, and intuitive physical inference.
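To make "gradients flowing through the simulation loop" concrete, here is a toy sketch in plain Python. It is not OrbiSim's code or API. The 1D point mass with linear drag, the constants, the `Dual` forward-mode autodiff helper, and the function names are all assumptions made for illustration. A loss measured only at the final step stands in for a sparse reward: the optimizer gets no feedback during the rollout, yet the gradient still reaches the single "policy" parameter (the launch velocity) because every simulation step is differentiable.

```python
from dataclasses import dataclass


@dataclass
class Dual:
    """Value plus its derivative w.r.t. one parameter (forward-mode autodiff)."""
    val: float
    dot: float = 0.0

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.dot + o.dot)

    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.dot - o.dot)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        o = Dual.lift(other)
        # Product rule: (a * b)' = a * b' + a' * b
        return Dual(self.val * o.val, self.val * o.dot + self.dot * o.val)

    __rmul__ = __mul__


# Hypothetical toy constants: time step, horizon, drag coefficient, goal position.
DT, STEPS, DRAG, TARGET = 0.05, 40, 0.5, 2.0


def rollout(v0):
    """Simulate a 1D point mass with linear drag; works on floats or Duals."""
    x, v = 0.0, v0
    for _ in range(STEPS):
        v = v - DRAG * v * DT   # drag slows the mass
        x = x + v * DT          # semi-implicit Euler position update
    return x


def loss_and_grad(v0):
    """Terminal-only squared error and its derivative w.r.t. the launch velocity."""
    x_final = rollout(Dual(v0, 1.0))   # seed d(v0)/d(v0) = 1
    err = x_final - TARGET
    loss = err * err
    return loss.val, loss.dot


def train(v0=0.0, lr=0.2, iters=200):
    """Plain gradient descent on the launch velocity."""
    for _ in range(iters):
        _, grad = loss_and_grad(v0)
        v0 -= lr * grad
    return v0
```

Calling `train()` returns a launch velocity for which `rollout(...)` lands close to `TARGET`. A real system would replace the hand-written dynamics with a learned model and the `Dual` helper with a full autodiff framework, but the structure (simulate, measure a terminal loss, backpropagate to the policy parameters) is the same.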
Empirical results reported by the authors show OrbiSim significantly outperforming state-of-the-art world models in both predictive fidelity and control performance. The model also responds consistently to changes in asset configurations and physical parameters, which suggests it could serve as a differentiable tool for robot simulation and policy training. By enabling gradients to flow through physics, perception, and decision-making in a unified manner, OrbiSim opens new avenues for more efficient and robust embodied intelligence, particularly in scenarios where sparse rewards make traditional reinforcement learning difficult.
- End-to-end differentiability across the entire simulation loop, from state transitions to visual generation
- Outperforms state-of-the-art world models in predictive fidelity and control performance
- Enables tasks previously intractable for classical simulators, such as differentiable contact modeling and gradient-based policy optimization under sparse rewards
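
Why contact is hard to differentiate can be shown with a small sketch. The source does not describe how OrbiSim models contact, so the following is only one common relaxation (a softplus-smoothed penalty force), with hypothetical constants and function names. A hard penalty force has zero gradient whenever two bodies are apart, so an optimizer gets no signal about how to reach contact; the smoothed version yields a small nonzero gradient even before the bodies touch.

```python
import math

STIFFNESS = 100.0   # hypothetical spring constant of the penalty force
SHARPNESS = 50.0    # hypothetical smoothing factor; larger approaches hard contact


def hard_contact_force(gap):
    """Penalty force that is exactly zero until the bodies overlap (gap < 0)."""
    return STIFFNESS * max(0.0, -gap)


def hard_contact_grad(gap):
    """d(force)/d(gap): zero whenever the bodies are apart."""
    return -STIFFNESS if gap < 0.0 else 0.0


def soft_contact_force(gap):
    """Softplus-smoothed penalty force, differentiable for every gap."""
    z = -SHARPNESS * gap
    # Numerically stable softplus: log(1 + exp(z))
    softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return STIFFNESS * softplus / SHARPNESS


def soft_contact_grad(gap):
    """d(force)/d(gap) = -STIFFNESS * sigmoid(-SHARPNESS * gap)."""
    z = -SHARPNESS * gap
    # Numerically stable sigmoid
    if z >= 0.0:
        sigmoid = 1.0 / (1.0 + math.exp(-z))
    else:
        e = math.exp(z)
        sigmoid = e / (1.0 + e)
    return -STIFFNESS * sigmoid


# Compare the two gradients at a 2 cm separation (bodies not yet touching).
gap = 0.02
print("hard gradient:", hard_contact_grad(gap))
print("soft gradient:", soft_contact_grad(gap))
```

The trade-off in this kind of relaxation is that a smaller `SHARPNESS` gives more useful gradients at a distance but also lets a small force act before true contact.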
Why It Matters
A fully differentiable physics engine for robots could dramatically accelerate policy learning and simulation-based training.