
Embodied-R1.5: 8B model beats GPT-5.4 on 16 out of 24 robotics benchmarks

Tiny 8B-parameter model outperforms giants on physical-world tasks

Deep Dive

Embodied-R1.5 is a new unified Embodied Foundation Model (EFM) developed by researchers from several institutions, including Tianjin University and Tsinghua University. With only 8 billion parameters, it integrates embodied reasoning capabilities (cognition, task planning, correction, and pointing) within a single architecture. The model was trained on a dataset of over 15 billion tokens, built using three automated data construction pipelines. A multi-task balanced reinforcement learning recipe was designed to alleviate conflicts among heterogeneous tasks, so that the model can perform well across diverse physical intelligence challenges.

Embodied-R1.5 achieves state-of-the-art results on 16 out of 24 embodied VLM benchmarks, outperforming leading models like Gemini-Robotics-ER-1.5 and GPT-5.4. Its key innovation is the Planner-Grounder-Corrector (PGC) closed-loop framework, which enables the single model to autonomously execute and self-correct over long-horizon tasks. The model can be fine-tuned into a Vision-Language-Action (VLA) model with minimal data, surpassing established VLA models like π0.5 across four popular manipulation benchmark suites. Extensive zero-shot real-robot experiments validated strong generalization in instruction following, affordance grounding, articulated object manipulation, and complex long-horizon tasks. The team has open-sourced model weights, datasets, training code, and an evaluation framework called EmbodiedEvalKit.
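To make the closed-loop idea concrete, here is a minimal Python sketch of a plan, ground, execute, and correct cycle. It is an illustration only and not the Embodied-R1.5 implementation: the function `run_pgc_loop`, the `StepResult` record, the callback signatures, and the retry limit are all hypothetical names and choices introduced here. In the real system a single model plays the planner, grounder, and corrector roles; the sketch stands in for them with plain callables and a toy simulated environment.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Point = Tuple[int, int]  # hypothetical 2D pointing target (e.g. pixel coordinates)


@dataclass
class StepResult:
    subtask: str
    target: Point
    attempts: int
    succeeded: bool


def run_pgc_loop(
    instruction: str,
    plan: Callable[[str], List[str]],        # planner: instruction -> ordered subtasks
    ground: Callable[[str], Point],          # grounder: subtask -> target point
    execute: Callable[[str, Point], None],   # low-level controller (assumed to exist)
    check: Callable[[str], bool],            # corrector: did the subtask succeed?
    max_retries: int = 2,                    # illustrative limit, not from the source
) -> List[StepResult]:
    """Run each planned subtask, re-grounding and retrying when the check fails."""
    results: List[StepResult] = []
    for subtask in plan(instruction):
        target = ground(subtask)
        attempts = 0
        succeeded = False
        while attempts <= max_retries:
            attempts += 1
            execute(subtask, target)
            if check(subtask):
                succeeded = True
                break
            # Correction step: look again and pick a fresh target before retrying.
            target = ground(subtask)
        results.append(StepResult(subtask, target, attempts, succeeded))
        if not succeeded:
            break  # stop the long-horizon task once a subtask cannot be recovered
    return results


def demo() -> List[StepResult]:
    """Toy environment in which 'open drawer' fails on its first attempt."""
    done = set()
    tries = {}

    def plan(instruction: str) -> List[str]:
        return [s.strip() for s in instruction.split(" then ")]

    def ground(subtask: str) -> Point:
        return (len(subtask), 0)  # placeholder target, not a real prediction

    def execute(subtask: str, target: Point) -> None:
        tries[subtask] = tries.get(subtask, 0) + 1
        if subtask != "open drawer" or tries[subtask] >= 2:
            done.add(subtask)

    def check(subtask: str) -> bool:
        return subtask in done

    return run_pgc_loop("open drawer then pick up cup", plan, ground, execute, check)


if __name__ == "__main__":
    for step in demo():
        print(step)
```

In the toy run, the first subtask needs two attempts and the second succeeds immediately, which is the behavior a corrector is meant to provide: a failed step is detected and retried instead of silently derailing the rest of the task.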

Key Points
  • 8B-parameter model trained on over 15B tokens, using multi-task balanced RL to reduce conflicts among tasks.
  • Achieves SOTA on 16/24 embodied VLM benchmarks, beating Gemini-Robotics-ER-1.5 and GPT-5.4.
  • Planner-Grounder-Corrector (PGC) loop enables self-correction; fine-tuned VLA outperforms π0.5 on 4 manipulation suites.

Why It Matters

A compact open-source model that rivals top proprietary systems could accelerate embodied AI research and real-world robotics.
