See Something, Say Something: Context-Criticality-Aware Mobile Robot Communication for Hazard Mitigations
A new framework uses VLMs to let robots decide whether a knife is a kitchen tool or an urgent threat.
A team of researchers including Bhavya Oza and Devam Shah has published a novel framework titled "See Something, Say Something" for autonomous mobile robots (AMRs). The core innovation is a context-criticality-aware communication system that moves beyond simple object detection. Instead of triggering a uniform alert for every detected hazard, the framework uses Vision-Language Models (VLMs) or Large Language Models (LLMs) to perform a structured assessment. This assessment evaluates the criticality level, time sensitivity, and feasibility of mitigation based on the object's context. For example, a knife spotted in a kitchen generates a calm acknowledgment, while the same object found in a school corridor triggers an urgent, coordinated alert to security.
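To make the structured assessment concrete, here is a minimal Python sketch of how the three dimensions named above (criticality, time sensitivity, mitigation feasibility) might feed an alert decision. It is not the authors' implementation. The names `Level`, `HazardAssessment`, `assess`, and `choose_alert`, the three-level scale, the alert labels, and the decision rules are all assumptions made for illustration. A hand-written lookup table stands in for the VLM/LLM call so the example runs on its own.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-step scale; the source does not specify one."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class HazardAssessment:
    criticality: Level         # how dangerous the object is in this context
    time_sensitivity: Level    # how quickly someone needs to act
    mitigation_feasible: bool  # whether the robot could mitigate on its own


# Stand-in for the VLM/LLM: a fixed table keyed by (object, location).
# A real system would query a model here; these values are invented.
_STUB_ASSESSMENTS = {
    ("knife", "kitchen"): HazardAssessment(Level.LOW, Level.LOW, False),
    ("knife", "school corridor"): HazardAssessment(Level.HIGH, Level.HIGH, False),
}


def assess(obj: str, location: str) -> HazardAssessment:
    """Return a structured assessment; unknown pairs get a cautious default."""
    default = HazardAssessment(Level.MEDIUM, Level.MEDIUM, False)
    return _STUB_ASSESSMENTS.get((obj, location), default)


def choose_alert(a: HazardAssessment) -> str:
    """Map an assessment to an alert style (illustrative rules only)."""
    if a.criticality == Level.HIGH and a.time_sensitivity == Level.HIGH:
        return "urgent_alert"       # e.g. coordinated alert to security
    if a.criticality == Level.LOW:
        return "acknowledge"        # calm acknowledgment, no escalation
    # Anything in between: act if the robot can, otherwise tell a human.
    return "mitigate_and_report" if a.mitigation_feasible else "notify"


if __name__ == "__main__":
    for location in ("kitchen", "school corridor"):
        print(location, "->", choose_alert(assess("knife", location)))
```

The same detected object produces different alerts because the decision depends on the assessment rather than on the object class alone. This is the contrast with fixed-priority alerting that the article describes.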
This dynamic, context-sensitive approach was validated in more than 60 real-world runs using a patrolling mobile robot. The results showed two benefits: a measurable reduction in the time to action for genuine threats and higher user trust, which reached 82%. This trust level surpassed that of traditional fixed-priority alert systems, which often cause alarm fatigue by over-flagging benign objects. The research argues that for robots to be effective partners in safety-critical environments, from warehouses to hospitals, they must not only see hazards but also communicate their severity based on the surrounding situation, enabling faster and more effective human or automated responses.
- Uses VLM/LLM-based perception to assess hazard context, not just presence (e.g., knife in kitchen vs. corridor).
- Validated in 60+ real-world runs, with user trust reaching 82%, higher than with fixed-priority alert systems.
- Framework evaluates criticality, time sensitivity, and mitigation feasibility to generate adaptive, appropriate alerts.
Why It Matters
Enables safer, more trustworthy human-robot collaboration in dynamic environments by preventing alarm fatigue and accelerating genuine emergency responses.