Barak Or's review finds no complete runtime guardrail for physical AI
Robots and drones may appear confident while making disastrous physical decisions.
Physical AI systems, from autonomous vehicles to industrial drones, increasingly map complex observations and language instructions into real-world actions. A new literature review by researcher Barak Or identifies a critical safety blind spot: these models can suffer from 'silent failures.' Unlike conventional AI content moderation errors, a silent failure occurs when a black-box model appears confident, plausible, and semantically aligned yet issues a physically consequential action that is invalid. The causes range from sensor drift and occlusion to state-estimation errors and hallucinated affordances. Such failures can go undetected until downstream hardware controllers trigger a violation, by which point damage may already have occurred.
The review, published on arXiv and spanning 23 pages, surveys embodied foundation models, world models, safety benchmarks, safe control, and runtime assurance techniques. It finds that capability and safety have advanced along largely separate tracks across these fields, with no single stream supplying a complete runtime authorization boundary between the AI's decision and physical execution. To address this gap, Or formalizes the problem, defines 'silent physical-action failure,' and develops a taxonomy of runtime guardrail functions along with evaluation requirements for comparing them. The paper serves as a roadmap for designing verification layers that can authorize or block AI-generated actions before they reach motors, actuators, or vehicle controls.
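To make the idea of a runtime authorization boundary concrete, here is a minimal Python sketch of a check that sits between a model's proposed action and the actuators. It is an illustration only, not the taxonomy or method from Or's review: the names `ProposedAction`, `RuntimeEvidence`, and `authorize`, the chosen signals, and every threshold are assumptions made up for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    AUTHORIZE = "authorize"
    BLOCK = "block"


@dataclass(frozen=True)
class ProposedAction:
    # Hypothetical action format: a commanded speed plus the model's self-reported confidence.
    speed_mps: float
    model_confidence: float


@dataclass(frozen=True)
class RuntimeEvidence:
    # Hypothetical signals collected independently of the black-box model.
    state_age_s: float          # seconds since the last state estimate
    sensor_disagreement_m: float  # gap between two redundant range sensors, in metres


def authorize(action, evidence, max_speed_mps=2.0, max_state_age_s=0.2,
              max_disagreement_m=0.5):
    """Return (verdict, reasons). Thresholds are illustrative assumptions.

    The model's confidence is deliberately ignored: a silent failure is
    exactly the case where confidence is high but the action is invalid.
    """
    reasons = []
    if abs(action.speed_mps) > max_speed_mps:
        reasons.append("speed exceeds envelope")
    if evidence.state_age_s > max_state_age_s:
        reasons.append("stale state estimate")
    if evidence.sensor_disagreement_m > max_disagreement_m:
        reasons.append("sensors disagree")
    verdict = Verdict.BLOCK if reasons else Verdict.AUTHORIZE
    return verdict, reasons


# A confident, in-envelope action proposed while redundant sensors disagree.
confident_action = ProposedAction(speed_mps=1.0, model_confidence=0.99)
drifting_evidence = RuntimeEvidence(state_age_s=0.05, sensor_disagreement_m=1.2)
verdict, reasons = authorize(confident_action, drifting_evidence)
print(verdict.value, reasons)
```

In this sketch the action is blocked despite the model's 0.99 confidence, because the decision rests only on evidence gathered outside the model. A real guardrail would need far richer checks; the review's point is that no existing line of work supplies a complete version of this boundary.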
- Physical AI failures can be 'silent': the model appears confident, but the action is invalid because of sensor drift, occlusion, or state-estimation errors.
- No existing technique across embodied AI, safe control, or verification provides a complete runtime authorization boundary between black-box models and physical execution.
- The paper proposes a bounded problem formulation, a definition of silent failure, and a taxonomy for evaluating guardrails as assurance mechanisms.
Why It Matters
As autonomous robots and vehicles proliferate, silent failures that go unnoticed until after an action has executed pose serious safety risks, and the review finds that no existing technique fully guards against them.