Uni-LaViRA navigates robots zero-shot across wheeled, quadruped, humanoid, and UAV

arXiv cs.RO May 28, 2026

⚡A single architecture controls four robot types on four navigation tasks without any training.

Deep Dive

Uni-LaViRA rethinks embodied navigation as a translation problem: language provides semantic directional commands and vision provides pixel-level targets, both handled natively by pretrained multimodal LLMs. This structural insight eliminates the need for robot-specific training data. The architecture extends this to four distinct robot platforms (wheeled, quadruped, humanoid, self-built UAV) and four task families (VLN-CE, ObjectNav, EQA, Aerial-VLN) with zero additional training. Two innovations make this practical: TODO List Memory (TDM) maintains a structured checklist of sub-goals, re-injecting unfinished items into the agent’s attention window at each step, and Second Chance Backtrack (SCB) rolls the robot back to a pre-error state, turning navigation into a self-correcting loop.
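The two mechanisms can be illustrated with a toy agent loop. This is only a minimal sketch under stated assumptions, not the authors' implementation: the names `TodoListMemory`, `run_episode`, `make_flaky_policy`, and `is_error` are hypothetical, the "robot state" is just a tuple of completed sub-goals, and the multimodal LLM is replaced by a scripted policy that makes one mistake.

```python
from dataclasses import dataclass, field


@dataclass
class TodoListMemory:
    """Hypothetical stand-in for TDM: an ordered checklist of sub-goals."""
    items: list
    done: set = field(default_factory=set)

    def next_pending(self):
        """Index of the first unfinished sub-goal, or None if all are done."""
        for i in range(len(self.items)):
            if i not in self.done:
                return i
        return None

    def render(self) -> str:
        """Checklist text that would be re-injected into the prompt each step."""
        return "\n".join(
            f"[{'x' if i in self.done else ' '}] {goal}"
            for i, goal in enumerate(self.items)
        )


def run_episode(todo, policy, is_error, start_state=(), max_steps=20):
    """Toy loop: act on the next sub-goal, roll back to a checkpoint on error."""
    state = start_state
    checkpoint = start_state  # last state known to be error-free
    trace = []
    for _ in range(max_steps):
        idx = todo.next_pending()
        if idx is None:
            break  # every sub-goal is checked off
        goal = todo.items[idx]
        prompt = todo.render()  # TDM: unfinished items stay visible every step
        candidate = policy(state, goal, prompt)
        if is_error(candidate):
            # SCB stand-in: discard the bad step and retry from the checkpoint.
            state = checkpoint
            trace.append(("backtrack", checkpoint))
            continue
        state = candidate
        checkpoint = state
        todo.done.add(idx)
        trace.append(("done", goal))
    return state, trace


def make_flaky_policy():
    """Scripted policy (assumption): fails its first attempt at 'enter kitchen'."""
    attempts = {}

    def policy(state, goal, prompt):
        attempts[goal] = attempts.get(goal, 0) + 1
        if goal == "enter kitchen" and attempts[goal] == 1:
            return state + ("WRONG_ROOM",)
        return state + (goal,)

    return policy


def is_error(state) -> bool:
    """Toy error detector (assumption): the last step landed in the wrong room."""
    return bool(state) and state[-1] == "WRONG_ROOM"
```

Running `run_episode(TodoListMemory(["exit hallway", "enter kitchen", "stop at fridge"]), make_flaky_policy(), is_error)` takes one wrong step, backtracks once, and then finishes the checklist. In a real system, the policy would be a multimodal LLM call and the rollback would physically move the robot; how the paper detects errors and stores checkpoints is not described in this summary.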

Benchmark results are striking: 60.7% success rate (SR) on VLN-CE R2R, 51.3% on RxR, 77.7% on HM3D-v2, 60.0% on HM3D-OVON, 54.7% on MP3D-EQA, and 40.0% on OpenUAV. The reported numbers rival or surpass those of foundation models trained on millions of robot trajectories and thousands of GPU-hours. For practitioners, this suggests that an organization could deploy a single navigation controller across a heterogeneous robot fleet without collecting or labeling new data – a significant step toward plug-and-play robotics.

Key Points

Uni-LaViRA achieves zero-shot navigation across four robot types (wheeled, quadruped, humanoid, UAV) and four task families, requiring no training data.
Key benchmarks: 77.7% SR on HM3D-v2, 60.7% on VLN-CE R2R, and 40.0% on OpenUAV – matching trained models.
Two agent-loop mechanisms – TODO List Memory (TDM) for sub-goal tracking and Second Chance Backtrack (SCB) for error recovery – make unified zero-shot navigation practical.

Why It Matters

Enables any robot fleet to navigate new environments without costly training data or fine-tuning.
