DEHP: Dynamic Execution Horizon Prediction Boosts Robot Policy Success Rates
Chunk-based robot policies get smarter with a learned, adaptive execution horizon.
Action chunking has become a standard design in modern robot policies, from diffusion/flow policies to vision-language-action models. Instead of acting one step at a time, the policy predicts a sequence of actions and executes a fixed number of them. However, this fixed execution horizon forces the policy to operate open-loop during chunk execution, which is especially problematic for fine-grained manipulation tasks that require frequent replanning. Typically, the execution horizon is chosen through empirical tuning and is highly task-dependent.
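As a minimal sketch of the fixed-horizon loop described above, the following Python runs a chunk policy and counts how often it is queried. The names `run_fixed_horizon`, `toy_policy`, and `toy_step`, and the 1-D toy dynamics, are illustrative assumptions, not part of any particular robot stack.

```python
from typing import Callable, List, Tuple


def run_fixed_horizon(
    policy: Callable[[float], List[float]],
    step: Callable[[float, float], float],
    obs: float,
    execute_horizon: int,
    max_steps: int,
) -> Tuple[float, int]:
    """Run a chunk policy, executing `execute_horizon` actions per prediction."""
    if execute_horizon < 1:
        raise ValueError("execute_horizon must be at least 1")
    policy_calls = 0
    t = 0
    while t < max_steps:
        chunk = policy(obs)  # one query yields a whole sequence of actions
        policy_calls += 1
        # Open-loop segment: the policy is not consulted again until
        # `execute_horizon` actions have been executed.
        for action in chunk[:execute_horizon]:
            obs = step(obs, action)
            t += 1
            if t >= max_steps:
                break
    return obs, policy_calls


# Toy stand-ins (assumptions): a 1-D position and a chunk of 8 equal moves toward 0.
def toy_policy(obs: float) -> List[float]:
    return [-0.1 * obs] * 8


def toy_step(obs: float, action: float) -> float:
    return obs + action


_, calls_h4 = run_fixed_horizon(toy_policy, toy_step, 1.0, execute_horizon=4, max_steps=12)
_, calls_h1 = run_fixed_horizon(toy_policy, toy_step, 1.0, execute_horizon=1, max_steps=12)
print(calls_h4, calls_h1)  # 3 policy calls vs. 12 over the same 12 steps
```

The trade-off is visible in the call counts: a longer horizon means fewer policy queries but longer stretches in which the robot acts without looking at new observations.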
To address this, the authors propose Dynamic Execution Horizon Prediction (DEHP). DEHP trains a lightweight execution-horizon prediction branch with online reinforcement learning while keeping the pretrained chunk policy frozen. This makes the method compatible with black-box chunk policies and isolates the effect of adapting the execution horizon from changes to the underlying action generator. In the reported evaluations, DEHP improves success rates by a large margin on a range of high-precision and long-horizon manipulation tasks. Qualitative analysis shows that DEHP predicts shorter horizons during fine-grained stages and longer horizons during free-space motion, balancing open-loop efficiency with closed-loop reactivity.
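To make the idea concrete, here is a hedged sketch of what an adaptive-horizon rollout could look like at inference time. It is not the DEHP implementation: the hand-written `toy_horizon_head` rule is a hypothetical stand-in for the learned prediction branch (which the method trains with online RL), and `toy_chunk_policy`, `toy_step`, and the 1-D dynamics are assumptions for illustration. The chunk policy is only ever called, never modified, mirroring the frozen, black-box setting.

```python
from typing import Callable, List, Tuple


def run_dynamic_horizon(
    chunk_policy: Callable[[float], List[float]],
    horizon_head: Callable[[float, List[float]], int],
    step: Callable[[float, float], float],
    obs: float,
    max_steps: int,
) -> Tuple[float, List[int]]:
    """Roll out a frozen chunk policy with a per-chunk execution horizon."""
    horizons: List[int] = []
    t = 0
    while t < max_steps:
        chunk = chunk_policy(obs)      # frozen policy, treated as a black box
        h = horizon_head(obs, chunk)   # stand-in for the learned horizon branch
        # Clamp to a valid range: at least 1, at most the chunk or remaining budget.
        h = max(1, min(h, len(chunk), max_steps - t))
        for action in chunk[:h]:
            obs = step(obs, action)
        t += h
        horizons.append(h)
    return obs, horizons


# Toy stand-ins (assumptions): obs is a non-negative distance to a target at 0.
def toy_chunk_policy(obs: float, chunk_len: int = 8) -> List[float]:
    actions, pos = [], obs
    for _ in range(chunk_len):
        a = -min(0.5, pos)  # move at most 0.5 per step, never past the target
        actions.append(a)
        pos += a
    return actions


def toy_horizon_head(obs: float, chunk: List[float]) -> int:
    # Hypothetical rule: replan every step near the target, every 4 steps far away.
    return 1 if obs < 1.0 else 4


def toy_step(obs: float, action: float) -> float:
    return obs + action


final_obs, horizons = run_dynamic_horizon(
    toy_chunk_policy, toy_horizon_head, toy_step, obs=4.5, max_steps=12
)
print(final_obs, horizons)
```

With this toy rule, the rollout starting at 4.5 executes horizons `[4, 4, 1, 1, 1, 1]`: long open-loop segments while far from the target, then step-by-step replanning once it is close. In the actual method, a learned branch would take the place of the fixed threshold.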
- Trains a lightweight horizon prediction branch via online RL while keeping the pretrained chunk policy frozen, which keeps the method compatible with black-box policies.
- Adjusts the execution horizon dynamically: shorter during fine-grained manipulation, longer during free-space motion, improving both reactivity and efficiency.
- Improves success rates by a large margin on high-precision and long-horizon robot manipulation tasks.
Why It Matters
Adaptive execution horizons make robot policies more effective for real-world tasks requiring both precision and efficiency.