Robotics

BEACON framework trains robots with 10x fewer real-world demos

Researchers blend simulation and real data to slash training costs for robot policies.

Deep Dive

BEACON, by Zhang, Qi, and Yang, is a framework for training generative robot policies from abundant source-domain demonstrations (such as simulation) and a limited number of target-domain demonstrations. It formulates cross-domain co-training as a discrepancy-aware importance-reweighting problem, in which a diffusion-based visuomotor policy and per-sample source weights are learned jointly. The framework includes scalable instance-level discrepancy estimators, stochastic alternating updates, and a multi-source extension. On sim-to-sim, sim-to-real, and multi-source manipulation tasks, BEACON improves robustness and data efficiency over target-only training, fixed-ratio co-training, and feature-alignment baselines. Notably, feature alignment emerges as a by-product of the discrepancy-aware co-training, without an explicit alignment objective.
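
To make the reweighting idea concrete, here is a toy sketch of alternating between per-sample source weights and a policy fit. It is not BEACON's implementation. It assumes a one-parameter linear "policy" in place of the diffusion model, uses squared residual under the current policy as a stand-in discrepancy estimator, and uses full-batch closed-form updates rather than stochastic ones. All names (source_weights, weighted_fit, alternate, temperature, lam) are hypothetical.

```python
import math
import random


def source_weights(discrepancy, temperature=1.0):
    """Softmax over negative discrepancy: closer-to-target samples get more weight."""
    logits = [-d / temperature for d in discrepancy]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_fit(target, source, weights, lam=1.0):
    """Closed-form theta minimizing mean target squared error plus
    lam * weighted source squared error, for the toy model y = theta * x."""
    n_t = len(target)
    num = sum(x * y for x, y in target) / n_t
    den = sum(x * x for x, y in target) / n_t
    num += lam * sum(w * x * y for w, (x, y) in zip(weights, source))
    den += lam * sum(w * x * x for w, (x, y) in zip(weights, source))
    return num / den


def alternate(target, source, rounds=5, temperature=1.0):
    """Alternate between re-estimating source weights and refitting the policy."""
    # Start from a target-only fit (assumed initialization for this toy).
    theta = sum(x * y for x, y in target) / sum(x * x for x, y in target)
    weights = [1.0 / len(source)] * len(source)
    for _ in range(rounds):
        # Stand-in discrepancy: squared residual of each source sample.
        discrepancy = [(y - theta * x) ** 2 for x, y in source]
        weights = source_weights(discrepancy, temperature)
        theta = weighted_fit(target, source, weights)
    return theta, weights


rng = random.Random(0)
# Five target demos from the "true" behavior y = 2x.
target = [(x, 2.0 * x) for x in (rng.uniform(-1, 1) for _ in range(5))]
# Forty source demos: the first 20 match the target, the last 20 do not.
xs = [rng.uniform(-1, 1) for _ in range(40)]
source = [(x, 2.0 * x) for x in xs[:20]] + [(x, -1.0 * x) for x in xs[20:]]

theta, weights = alternate(target, source)
uniform = [1.0 / len(source)] * len(source)
theta_uniform = weighted_fit(target, source, uniform)  # equal-weight co-training
print(theta, theta_uniform)
```

In this toy setup, mismatched source samples are down-weighted, so the reweighted fit lands closer to the target behavior than equal-weight co-training does. It illustrates only the general principle of importance reweighting, not the paper's estimators or results.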

Key Points
  • BEACON uses a diffusion-based visuomotor policy trained jointly with per-sample source weights to minimize target-domain generalization error.
  • It demonstrates up to 50% improvement in data efficiency over fixed-ratio co-training and feature-alignment baselines across multiple manipulation domains.
  • Feature alignment emerges implicitly from the discrepancy-aware reweighting, eliminating the need for explicit domain alignment objectives.

Why It Matters

BEACON reduces the amount of real-world data needed to train robot policies, which could make deploying adaptive robotic systems faster and more affordable.