PGDG: Single-demo bimanual robot training hits 93% success
Robots learn complex two-handed tasks from just one demonstration, with simulation success rising from 38% to 93%.
Behavior cloning for contact-rich bimanual manipulation typically requires costly, diverse demonstrations. A policy trained on a single demonstration is often fragile: a small disturbance pushes the system into off-manifold states for which the demonstration offers no recovery supervision. PGDG addresses this by starting from one human demonstration and automatically generating a compact dataset of physically plausible, successful, and diverse recovery behaviors, with no extra labeling required.
PGDG iterates between a physics-grounded sampler and a dataset curator. The sampler generates physically plausible rollouts, while the curator identifies under-covered recovery modes and updates the sampling distribution accordingly. Short-horizon sampling replaces risky states with corrective actions, improving data quality. On RotateBox-Pitch, PGDG raised simulation success from 38% to 93% and real-world success from 35% to 82%. When PGDG data was used to fine-tune the GR00T foundation model, success improved from 46% to 77%. The result is robust bimanual policies from minimal human effort.
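The sampler and curator loop described above can be illustrated with a toy sketch. This is not PGDG's implementation: the five discrete "recovery modes", the random success model standing in for a physics rollout, and the names `rollout`, `curate`, and `generate` are all assumptions made for illustration. It only shows the general pattern of keeping successful rollouts and reweighting the sampler toward modes the dataset still lacks.

```python
import random
from collections import Counter

N_MODES = 5  # hypothetical number of recovery modes (e.g. disturbance bins)


def rollout(mode, rng):
    """Toy stand-in for a physics rollout; returns (trajectory, success)."""
    # Assumption: modes further from the centre bin are harder to recover from.
    difficulty = abs(mode - N_MODES // 2) / (N_MODES // 2)
    success = rng.random() > 0.6 * difficulty
    return {"mode": mode}, success


def curate(dataset, target_per_mode):
    """Curator: weight each mode by how far it is below its coverage target."""
    counts = Counter(t["mode"] for t in dataset)
    return [max(target_per_mode - counts[m], 0) for m in range(N_MODES)]


def generate(target_per_mode=20, batch=50, max_iters=100, seed=0):
    rng = random.Random(seed)
    dataset = []
    weights = [1] * N_MODES  # start from a uniform sampling distribution
    for _ in range(max_iters):
        # Sampler: draw modes from the current distribution and roll them out.
        for mode in rng.choices(range(N_MODES), weights=weights, k=batch):
            trajectory, success = rollout(mode, rng)
            full = sum(t["mode"] == mode for t in dataset) >= target_per_mode
            if success and not full:  # keep only successful, still-needed data
                dataset.append(trajectory)
        # Curator: shift sampling mass toward under-covered modes.
        weights = curate(dataset, target_per_mode)
        if sum(weights) == 0:  # every mode has reached its target
            break
    return dataset
```

Calling `generate()` returns a dataset that stays compact because each mode is capped at its target, while the harder modes receive more sampling attempts in later iterations.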
- PGDG generates a diverse, physically plausible dataset from a single demonstration without human labeling.
- Simulation success on RotateBox-Pitch improved from 38% to 93%; real-world success rose from 35% to 82%.
- Fine-tuning GR00T with PGDG data lifted task success from 46% to 77%.
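The reported gains can be restated as percentage-point differences and as ratios, which are easy to conflate. The short calculation below uses only the figures quoted above; the helper name `gains` and the dictionary labels are illustrative.

```python
# Success rates (before, after) in percent, as reported above.
results = {
    "RotateBox-Pitch (simulation)": (38, 93),
    "RotateBox-Pitch (real world)": (35, 82),
    "GR00T fine-tuning": (46, 77),
}


def gains(before, after):
    """Return (percentage-point gain, after/before ratio)."""
    return after - before, after / before


for name, (before, after) in results.items():
    points, ratio = gains(before, after)
    print(f"{name}: +{points} points, {ratio:.2f}x")
# RotateBox-Pitch (simulation): +55 points, 2.45x
# RotateBox-Pitch (real world): +47 points, 2.34x
# GR00T fine-tuning: +31 points, 1.67x
```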
Why It Matters
Cuts data collection costs for bimanual robot training, enabling scalable, robust manipulation in real-world settings.