Robot Planning and Situation Handling with Active Perception
Robots can now actively perceive and adapt to surprises like jammed doors and fallen objects.
Current robots can plan complex tasks but struggle when real-world surprises occur during execution, such as a door jamming or an object falling. These situations arise from the robot's own action failures or from external disturbances (e.g., human activity). Detecting and recovering from them in real time remains a major challenge for long-term autonomy.
To address this, a team of researchers from multiple institutions introduced VAP-TAMP (Vision-Action Perception Task and Motion Planning). The framework uses action knowledge to strategically prompt vision-language models (VLMs) for active view selection—essentially telling the robot where to look next to assess the situation. It then builds and reasons over scene graphs to integrate task planning with motion planning. VAP-TAMP was evaluated on service tasks in simulation and on a physical mobile manipulation platform, demonstrating its ability to detect and recover from execution-time failures and external disturbances, bringing robots closer to reliable, long-duration operation.
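To make the mechanism concrete, here is a minimal sketch (not the authors' code) of how action knowledge might drive VLM prompting for active view selection. `query_vlm` and `capture_image` are hypothetical stand-ins for a vision-language model API and a camera interface, and the prompt wording is illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class ActionKnowledge:
    action: str                                        # e.g. "open_door"
    preconditions: list = field(default_factory=list)  # facts required before acting
    effects: list = field(default_factory=list)        # facts expected afterward

def build_view_prompt(knowledge: ActionKnowledge, after_failure: bool) -> str:
    """Turn symbolic action knowledge into a question a VLM can answer from an image."""
    facts = knowledge.effects if after_failure else knowledge.preconditions
    return (
        f"The robot attempted '{knowledge.action}'. "
        f"From this camera view, which of these facts hold: {facts}? "
        "If the view is insufficient, name the object the camera should look at next."
    )

def select_next_view(knowledge, candidate_views, capture_image, query_vlm):
    """Try candidate viewpoints until the VLM can assess the situation."""
    for view in candidate_views:
        image = capture_image(view)   # move the camera and grab a frame
        answer = query_vlm(image, build_view_prompt(knowledge, after_failure=True))
        if "insufficient" not in answer.lower():
            return view, answer       # situation assessed from this viewpoint
    return None, None                 # no viewpoint worked; fall back to replanning
```

The point of this structure matches the paper's framing: the robot's knowledge of what an action should have achieved tells it what to look for, so the VLM is queried strategically rather than scanning the scene blindly.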
- VAP-TAMP uses action knowledge to prompt vision-language models for active view selection and situation assessment.
- It constructs and reasons over scene graphs for integrated task and motion planning, handling situations like jammed doors and fallen objects (see the scene-graph sketch after this list).
- Evaluated in simulation and on a real mobile manipulation platform, it detects and recovers from execution-time failures and external disturbances.
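The scene-graph side can be pictured similarly. Below is a minimal sketch, assuming a simple (subject, relation, object) graph built with `networkx`; the node and relation names are illustrative, not taken from the paper.

```python
import networkx as nx

def update_scene_graph(graph: nx.DiGraph, observations) -> nx.DiGraph:
    """Merge (subject, relation, object) triples from perception into the graph."""
    for subj, rel, obj in observations:
        graph.add_edge(subj, obj, relation=rel)
    return graph

def step_still_valid(graph: nx.DiGraph, required_facts) -> bool:
    """A task step stays executable only if all its preconditions appear in the graph."""
    return all(
        graph.has_edge(s, o) and graph[s][o]["relation"] == r
        for s, r, o in required_facts
    )

scene = nx.DiGraph()
update_scene_graph(scene, [("cup", "on", "table"), ("door", "state", "jammed")])

# "Pick cup from table" still holds, but "pass through door" must trigger replanning:
print(step_still_valid(scene, [("cup", "on", "table")]))     # True
print(step_still_valid(scene, [("door", "state", "open")]))  # False -> replan
```

Checking each planned step's preconditions against the continuously updated graph is one plausible way the task planner and motion planner could share a single picture of the world and notice disturbances between steps.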
Why It Matters
VAP-TAMP bridges the gap between robot planning and real-world unpredictability, a key step toward reliable long-term autonomy.