Robotics

E²DT: Efficient and Effective Decision Transformer with Experience-Aware Sampling for Robotic Manipulation

Robots learn 2x faster by cherry-picking the most valuable training moments.

Deep Dive

In robotic manipulation, Decision Transformers (DTs) suffer from poor sample efficiency because they replay experience uniformly and lack active exploration. The E²DT framework, accepted at ICRA 2026, addresses this by letting the model shape its own training data. It uses a k-Determinantal Point Process (k-DPP) to select the most informative trajectory windows. Each window's quality is scored by a composite metric that combines return-to-go (RTG) quantiles, predictive uncertainty, and inverse-frequency stage coverage. Diversity is measured in the DT's own latent embedding space. The resulting joint quality-diversity kernel favors a balanced set of experiences, so the robot neither overfits to common paths nor wastes time on irrelevant ones.
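To make the quality-diversity kernel concrete, here is a minimal sketch of one common way to build such a kernel and pick k windows from it. It is an illustration under stated assumptions, not the authors' implementation: the kernel form L = diag(q) · S · diag(q), the cosine similarity for S, and the function names `build_kernel` and `greedy_select` are all hypothetical choices. The selection step is a greedy determinant-maximizing approximation, swapped in for simplicity in place of exact k-DPP sampling.

```python
import numpy as np


def build_kernel(quality, embeddings):
    """Assumed kernel form: L = diag(q) @ S @ diag(q).

    S is the cosine-similarity Gram matrix of the window embeddings.
    Embeddings are assumed to be nonzero vectors.
    """
    q = np.asarray(quality, dtype=float)
    z = np.asarray(embeddings, dtype=float)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-normalize each row
    similarity = z @ z.T
    return q[:, None] * similarity * q[None, :]


def greedy_select(kernel, k):
    """Greedily pick up to k indices, each step maximizing the log-determinant
    of the selected submatrix. This approximates the most likely k-subset
    under the DPP; it is not exact k-DPP sampling.
    """
    selected = []
    for _ in range(k):
        best_idx, best_logdet = None, -np.inf
        for i in range(kernel.shape[0]):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(kernel[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best_idx, best_logdet = i, logdet
        if best_idx is None:  # every remaining candidate is redundant
            break
        selected.append(best_idx)
    return selected


# Toy data: windows 0 and 1 are near-duplicates, window 2 is different.
quality = [1.0, 0.9, 0.5]
embeddings = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
chosen = greedy_select(build_kernel(quality, embeddings), k=2)
print(chosen)  # [0, 2]
```

In the toy data, window 1 has higher quality than window 2 but is nearly identical to window 0, so the determinant penalizes selecting both and the lower-quality but distinct window 2 is chosen instead.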

E²DT was evaluated on challenging manipulation benchmarks in both simulated environments and real-robot setups. Results consistently showed improvements over baseline DTs and prior RL methods, particularly in long-horizon tasks where exploration and sample efficiency are critical. The method is designed to avoid both local optima caused by under-exploration and slow convergence caused by excessive exploration. By coupling policy learning with intelligent experience selection, E²DT offers a principled path toward robust, data-efficient robotic learning, a key step for deploying manipulation skills in real-world settings like manufacturing and logistics.

Key Points
  • E²DT uses a k-Determinantal Point Process (k-DPP) to actively select experience windows that are both high-quality and diverse, with diversity measured in the DT's latent embeddings.
  • The composite quality metric combines return-to-go (RTG) quantiles, predictive uncertainty, and inverse-frequency stage coverage.
  • Accepted at ICRA 2026, E²DT outperforms prior methods on both simulation and real-robot manipulation benchmarks.
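
The three ingredients of the composite quality metric can be illustrated with a short sketch. The summary does not say how the terms are normalized or combined, so the weighted sum, the min-max scaling, the default weights, and the function name `composite_quality` below are all assumptions made for illustration only.

```python
import numpy as np


def composite_quality(rtg, uncertainty, stage_ids, weights=(1.0, 1.0, 1.0)):
    """Hypothetical composite score per window: a weighted sum of
    RTG quantile, scaled uncertainty, and inverse-frequency stage coverage.
    """
    rtg = np.asarray(rtg, dtype=float)
    uncertainty = np.asarray(uncertainty, dtype=float)
    stage_ids = np.asarray(stage_ids)

    # RTG quantile: fraction of windows whose RTG is <= this window's RTG.
    rtg_quantile = (rtg[None, :] <= rtg[:, None]).mean(axis=1)

    # Predictive uncertainty, min-max scaled to [0, 1] (zeros if constant).
    span = uncertainty.max() - uncertainty.min()
    if span > 0:
        unc_scaled = (uncertainty - uncertainty.min()) / span
    else:
        unc_scaled = np.zeros_like(uncertainty)

    # Inverse-frequency coverage: windows from rarely seen stages score higher.
    counts = (stage_ids[None, :] == stage_ids[:, None]).sum(axis=1)
    inv_freq = 1.0 / counts
    coverage = inv_freq / inv_freq.max()

    w_rtg, w_unc, w_cov = weights
    return w_rtg * rtg_quantile + w_unc * unc_scaled + w_cov * coverage


# Four toy windows; the last one comes from a rare task stage.
scores = composite_quality(
    rtg=[1.0, 2.0, 3.0, 4.0],
    uncertainty=[0.1, 0.1, 0.5, 0.3],
    stage_ids=[0, 0, 0, 1],
)
print(scores.argmax())  # 3
```

With equal weights, the fourth window scores highest because it has the top RTG quantile and is the only window from its stage. In this sketch the scores stay strictly positive whenever the RTG weight is positive, which is convenient if they are later used to scale a similarity kernel.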

Why It Matters

More efficient robot learning means faster adaptation to new tasks, reducing training time and cost in industrial automation.