Mixed-Initiative Dialog for Human-Robot Collaborative Manipulation
The system improved task success in trials with 18 participants, beating pure LLM baselines.
A research team from UT Austin and Stanford has published a paper introducing MICoBot, a system that applies a Mixed-Initiative Dialog paradigm to human-robot collaborative manipulation. The core idea is that both the human and the robot can proactively propose, accept, or reject task steps in natural language, creating a flexible communication loop. This addresses a gap in long-horizon collaboration, where a human partner's behavior, willingness to assist, and understanding of the robot's capabilities can all vary. The system moves beyond traditional leader-follower or rigidly scripted interactions toward a more adaptive, conversational partnership for completing complex physical tasks.
MICoBot operates through a three-tiered architecture: a meta-planner that interprets the dialog to formulate a high-level collaboration strategy as code, a planner that optimally allocates the remaining steps based on a simulation-pretrained affordance model and estimated human availability, and an action executor that carries out low-level actions or speech. This structure lets it find collaborative strategies that minimize human effort while making full use of the robot's capabilities. In physical robot trials with 18 unique participants, MICoBot achieved significantly higher task success rates and a better user experience than a pure Large Language Model baseline and standard agent allocation models, supporting the practical value of its negotiated, language-based approach to teamwork.
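To make the planner's trade-off concrete, here is a minimal sketch of step allocation in Python. It is not MICoBot's actual planner: the `Step` fields, the `allocate` function, the cost formula, and all numbers are hypothetical, chosen only to illustrate how a per-step robot success estimate (standing in for the affordance model) and an estimate of human availability might jointly decide who takes each step.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    robot_success: float  # hypothetical affordance score in [0, 1]
    human_effort: float   # hypothetical effort cost if the human does the step


def allocate(steps, human_availability, effort_weight=1.0):
    """Assign each step to 'robot' or 'human' by comparing illustrative costs."""
    plan = {}
    for step in steps:
        # Robot cost: the estimated chance that the robot fails the step.
        robot_cost = 1.0 - step.robot_success
        # Human cost: their effort, plus the chance they decline to help.
        human_cost = effort_weight * step.human_effort + (1.0 - human_availability)
        plan[step.name] = "robot" if robot_cost <= human_cost else "human"
    return plan


# Illustrative values only; they do not come from the paper.
steps = [
    Step("pick up box", robot_success=0.9, human_effort=0.3),
    Step("tie ribbon", robot_success=0.1, human_effort=0.4),
]
plan = allocate(steps, human_availability=0.8)
print(plan)
```

With a fairly available human, the step the robot is unlikely to complete goes to the human. If `human_availability` is lowered to 0.2, the same step shifts back to the robot, which mirrors the idea of adapting the allocation to how willing the partner is to help.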
- Uses a three-level decision architecture (meta-planner, planner, action executor) to handle task negotiation via natural language.
- Tested in physical trials with 18 human participants, showing significant improvements over pure LLM baselines.
- Combines a simulation-pretrained affordance model of the robot's capabilities with an estimate of human availability to allocate task steps.
Why It Matters
Mixed-initiative dialog could enable more natural, efficient, and adaptive teamwork between humans and robots on complex physical tasks in warehouses, labs, or homes.