SCAR: New framework learns unified robot actions across embodiments
A shared latent action space could let robots reuse learned actions without separate action data for each body.
Deep Dive
SCAR is a self-supervised framework that learns continuous action representations from visual transitions across different robot embodiments. It pairs inverse and forward dynamics models on top of a pretrained generative backbone, and regularizes the latent actions with a Gaussian prior and an adversarial invariance objective to suppress embodiment-specific noise. In tests on Procgen and RoboTwin, SCAR improved cross-embodiment low-data adaptation and cross-task transfer.
Key Points
- SCAR uses an inverse dynamics model (IDM) to infer a latent action from each pair of observations; a forward dynamics model (FDM) then uses that action to predict the future latent.
- Gaussian prior regularization and an adversarial invariance objective suppress embodiment-specific noise, making the representations more transferable.
- On Procgen and RoboTwin, SCAR improves cross-embodiment low-data adaptation and cross-task transfer compared with using raw actions.
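
To make the pieces above concrete, here is a minimal NumPy sketch of how an IDM, an FDM, a Gaussian prior term, and an adversarial embodiment term might fit together in one objective. It is an illustration under stated assumptions, not SCAR's actual implementation: the linear maps, dimensions, loss weights `beta` and `lam`, and every function name are hypothetical, and random vectors stand in for the latents a pretrained backbone would produce.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the summary does not give SCAR's real dimensions.
LATENT_DIM, ACTION_DIM, N_EMBODIMENTS = 8, 4, 2

# Linear stand-ins for the learned networks (assumption for illustration).
W_idm = rng.normal(scale=0.1, size=(2 * LATENT_DIM, 2 * ACTION_DIM))
W_fdm = rng.normal(scale=0.1, size=(LATENT_DIM + ACTION_DIM, LATENT_DIM))
W_disc = rng.normal(scale=0.1, size=(ACTION_DIM, N_EMBODIMENTS))


def idm(z_t, z_next):
    """Inverse dynamics: infer a Gaussian over latent actions from a latent pair."""
    out = np.concatenate([z_t, z_next], axis=-1) @ W_idm
    mean, log_var = np.split(out, 2, axis=-1)
    return mean, log_var


def fdm(z_t, action):
    """Forward dynamics: predict the next latent from the current latent and action."""
    return np.concatenate([z_t, action], axis=-1) @ W_fdm


def kl_to_standard_normal(mean, log_var):
    """Batch mean of KL(N(mean, diag(exp(log_var))) || N(0, I))."""
    kl = 0.5 * np.sum(np.exp(log_var) + mean**2 - 1.0 - log_var, axis=-1)
    return kl.mean()


def embodiment_cross_entropy(action, embodiment_ids):
    """Cross-entropy of a discriminator guessing the embodiment from the action."""
    logits = action @ W_disc
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(embodiment_ids)), embodiment_ids].mean()


def scar_style_losses(z_t, z_next, embodiment_ids, beta=0.1, lam=0.1):
    """Compute the three loss terms and the combined encoder objective."""
    mean, log_var = idm(z_t, z_next)
    eps = rng.standard_normal(mean.shape)
    action = mean + np.exp(0.5 * log_var) * eps  # reparameterized sample
    recon = np.mean((fdm(z_t, action) - z_next) ** 2)
    kl = kl_to_standard_normal(mean, log_var)
    disc_ce = embodiment_cross_entropy(action, embodiment_ids)
    # The encoder is rewarded when the discriminator fails (minus sign);
    # the discriminator itself would be trained to minimize disc_ce.
    encoder_loss = recon + beta * kl - lam * disc_ce
    return {"recon": recon, "kl": kl, "disc_ce": disc_ce,
            "encoder_loss": encoder_loss}


# Random latents stand in for features from a pretrained backbone.
batch = 16
z_t = rng.standard_normal((batch, LATENT_DIM))
z_next = rng.standard_normal((batch, LATENT_DIM))
embodiment_ids = rng.integers(0, N_EMBODIMENTS, size=batch)
losses = scar_style_losses(z_t, z_next, embodiment_ids)
print({name: round(float(value), 4) for name, value in losses.items()})
```

In this sketch the KL term pulls inferred actions toward a standard normal, while the negated discriminator loss pushes the encoder to drop cues that reveal which embodiment produced a transition. A real system would alternate gradient updates between the discriminator and the encoder, or use a gradient reversal layer, rather than only evaluating the losses once.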
Why It Matters
If the results hold up, SCAR could let robots transfer learned skills across different bodies and reduce the need for embodiment-specific training data.