Robotics

OrbiSim turns world models into differentiable physics engines for robotics

arXiv cs.RO May 19, 2026

⚡ End-to-end differentiable simulation enables gradient-based policy training with sparse rewards.

Deep Dive

OrbiSim, developed by researchers at Shanghai Jiao Tong University, recasts world models as fully differentiable physics engines for robotic simulation. Where traditional world models focus on unconstrained imagination in latent or visual domains, OrbiSim establishes a unified, physically grounded pathway that connects structured scene assets, neural dynamics, and downstream reinforcement learning. Its key innovation is end-to-end differentiability across the entire simulation loop, from explicit state transitions to visual observation generation. This lets the system support tasks that were previously intractable with classical simulators, such as differentiable contact modeling, gradient-based policy optimization under sparse rewards, and intuitive physical inference.

According to the reported empirical results, OrbiSim significantly outperforms state-of-the-art world models in both predictive fidelity and control performance. Its outputs respond consistently to changes in asset configurations and physical parameters, which suggests potential as a differentiable tool for enhancing robot simulation and policy training. Because gradients can flow through physics, perception, and decision-making in a unified manner, OrbiSim opens new avenues for more efficient and robust embodied intelligence, particularly in scenarios where sparse rewards make traditional reinforcement learning difficult.
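To make the idea of gradients flowing through a simulation loop concrete, here is a minimal sketch in plain Python. It is not OrbiSim's code or API: the `Dual` class, `rollout`, `terminal_loss`, and `optimize` are hypothetical names chosen for illustration, the "physics" is a frictionless 1D point mass with no contact, and hand-written forward-mode automatic differentiation stands in for whatever autodiff machinery the actual system uses. The loss is evaluated only at the final step, which loosely mimics a sparse-reward setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value and its derivative w.r.t. one scalar parameter (forward-mode AD)."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.dot - other.dot)

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (a * b)' = a * b' + a' * b
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)
    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def rollout(force, steps=50, dt=0.02):
    """Simulate a unit point mass from rest at x = 0 under a constant force.

    Every state transition is ordinary arithmetic on Dual numbers, so the
    derivative of the final position w.r.t. the force is carried along.
    """
    x, v = Dual(0.0), Dual(0.0)
    for _ in range(steps):
        v = v + dt * force  # velocity update (semi-implicit Euler)
        x = x + dt * v      # position update
    return x


def terminal_loss(force, target=1.0):
    """Squared distance to the target, measured only at the last step."""
    err = rollout(force) - target
    return err * err


def optimize(force=0.0, lr=0.5, iters=100):
    """Plain gradient descent on the force, using the simulated gradient."""
    for _ in range(iters):
        loss = terminal_loss(Dual(force, 1.0))  # seed d(force)/d(force) = 1
        force -= lr * loss.dot
    return force, terminal_loss(Dual(force, 1.0)).val


if __name__ == "__main__":
    best_force, final_loss = optimize()
    print(f"force = {best_force:.4f}, terminal loss = {final_loss:.2e}")
```

Although the only learning signal arrives at the end of the rollout, the derivative carried through each state transition tells the optimizer how to adjust the control. A reinforcement learning method that treats the simulator as a black box would have to estimate that direction by sampling instead.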

Key Points

End-to-end differentiability across the entire simulation loop, from state transitions to visual generation
Outperforms state-of-the-art world models in predictive fidelity and control performance
Enables tasks previously intractable for classical simulators, such as differentiable contact modeling and gradient-based policy optimization under sparse rewards

Why It Matters

A fully differentiable physics engine for robots could dramatically accelerate policy learning and simulation-based training.
