Robotics

Forecast-GS boosts robot pick-and-place with predictive 3D

Robots now forecast task-completed states from language cues

Deep Dive

Forecast-aware Gaussian Splatting (Forecast-GS), developed by Kaixin Jia and Jiacheng Xu, tackles a key limitation in language-conditioned robotic manipulation: the inability to reason about the task-completed state. Existing systems ground language into affordances or value maps, but they perceive only the current scene, so they cannot tell whether a candidate action leads to a feasible final state. Forecast-GS explicitly predicts the 3D scene after task completion from the language instruction and partial observations, and represents that scene with Gaussian splatting, which keeps the forecast efficient to compute and interpretable.
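To make the idea of checking a forecast final state concrete, here is a toy sketch of candidate selection. It is not the authors' method: the `Candidate` class, the 2D forecast position, the circular target region, and the distance-based score are all hypothetical stand-ins for whatever the real system forecasts and ranks.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Sequence, Tuple


@dataclass
class Candidate:
    """A hypothetical candidate action paired with its forecast outcome."""
    name: str
    # Assumed: forecast (x, y) position of the object after the action, in metres.
    forecast_xy: Tuple[float, float]


def feasibility_score(c: Candidate, target_xy: Tuple[float, float], radius: float) -> float:
    """Score in [0, 1]: 1 at the target centre, 0 at or beyond the target radius."""
    d = hypot(c.forecast_xy[0] - target_xy[0], c.forecast_xy[1] - target_xy[1])
    return max(0.0, 1.0 - d / radius)


def select_candidate(
    candidates: Sequence[Candidate],
    target_xy: Tuple[float, float],
    radius: float,
) -> Optional[Candidate]:
    """Return the candidate whose forecast final state scores highest.

    Returns None when there are no candidates or when every forecast
    lands outside the target region (score 0).
    """
    if not candidates:
        return None
    best = max(candidates, key=lambda c: feasibility_score(c, target_xy, radius))
    return best if feasibility_score(best, target_xy, radius) > 0.0 else None


# Illustrative values only: three made-up grasps for a bowl centred at the origin.
candidates = [
    Candidate("grasp_a", (0.30, 0.00)),   # forecast lands outside the bowl
    Candidate("grasp_b", (0.02, 0.01)),   # forecast lands near the centre
    Candidate("grasp_c", (0.06, -0.05)),  # forecast lands inside, off-centre
]
chosen = select_candidate(candidates, target_xy=(0.0, 0.0), radius=0.10)
```

The point of the sketch is the ordering of steps: each candidate is judged by where it is forecast to leave the scene, not by how the current scene looks.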

Forecast-GS was validated on three real-world pick-and-place tasks (Cutter-to-Box, Apple-to-Bowl, and Sponge-to-Tray) with 25 trials each. With automatic candidate selection it succeeded in 21/25, 23/25, and 16/25 trials, outperforming the ReKep baseline (15/25, 19/25, 10/25) on every task. A diagnostic variant with human-assisted selection raised the results to 23/25, 24/25, and 19/25, which suggests that candidate generation is strong and that automatic ranking is where the remaining headroom lies. The work provides an interpretable bridge between language understanding, 3D perception, and manipulation planning.
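The trial counts above convert to percentage success rates with simple arithmetic. The short sketch below does the conversion using only the counts reported in this article; the dictionary layout and the function names are illustrative choices, not part of the original work.

```python
TRIALS = 25  # trials per task, as reported

# Successful trials per task and method, copied from the reported results.
results = {
    "Cutter-to-Box": {"ReKep": 15, "Forecast-GS (auto)": 21, "Forecast-GS (human-assisted)": 23},
    "Apple-to-Bowl": {"ReKep": 19, "Forecast-GS (auto)": 23, "Forecast-GS (human-assisted)": 24},
    "Sponge-to-Tray": {"ReKep": 10, "Forecast-GS (auto)": 16, "Forecast-GS (human-assisted)": 19},
}


def success_rate(successes: int, trials: int = TRIALS) -> float:
    """Return the success rate as a percentage."""
    return 100.0 * successes / trials


def rate_table(counts: dict) -> dict:
    """Map every task/method success count to a percentage."""
    return {
        task: {method: success_rate(n) for method, n in methods.items()}
        for task, methods in counts.items()
    }


rates = rate_table(results)
for task, methods in rates.items():
    for method, pct in methods.items():
        print(f"{task:15s} {method:30s} {pct:.0f}%")
```

Because every task uses the same 25 trials, each additional success is worth four percentage points, which makes the gaps between methods easy to read off.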

Key Points
  • Forecast-GS predicts task-completed 3D states from language instructions, unlike systems that only reason over current scenes.
  • With automatic selection, Forecast-GS reached 84%/92%/64% success on Cutter-to-Box, Apple-to-Bowl, and Sponge-to-Tray, vs. 60%/76%/40% for the ReKep baseline.
  • Human-assisted selection boosted success to 92%/96%/76%, highlighting automatic ranking as a remaining challenge.

Why It Matters

By forecasting the task-completed scene, Forecast-GS lets a robot check whether a candidate action leads to a feasible final state before it acts, which is crucial for reliable autonomous manipulation from language commands.