Learning Physical Principles from Interaction: Self-Evolving Planning via Test-Time Memory
New memory framework boosts robot manipulation success from 23% to 76% by learning physical principles through interaction.
A research team from Stanford University and UC San Diego has introduced PhysMem, a novel memory framework that lets vision-language model (VLM) robot planners learn physical principles through direct interaction during operation, without traditional model retraining. The system addresses a critical limitation: current VLM planners can reason about general concepts like friction and stability, but struggle to predict specific outcomes, such as how a particular ball will roll on a particular surface, without prior experience. PhysMem operates by recording interaction experiences, generating candidate hypotheses about physical properties, and, crucially, verifying those hypotheses through targeted testing before promoting validated knowledge to guide future decisions.
This 'verification before application' design reduces rigid reliance on prior experience when physical conditions change, allowing robots to adapt dynamically. The framework was evaluated on three real-world manipulation tasks and simulation benchmarks across four different VLM backbones. In a controlled brick insertion task, PhysMem's principled abstraction achieved a 76% success rate, dramatically outperforming the 23% success of direct experience retrieval methods. Real-world experiments demonstrated consistent performance improvements over 30-minute deployment sessions, showing the system's ability to self-evolve its planning capabilities through test-time learning. This approach represents a significant step toward more adaptive and robust robotic systems that can operate reliably in unstructured environments.
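The record, hypothesize, verify, and promote loop described above can be sketched in a few lines. This is an illustrative approximation, not the authors' implementation: all class and method names (`TestTimeMemory`, `propose`, `verify`, the `friction_coeff` property, the tolerance threshold) are hypothetical, and the paper's actual memory design may differ substantially.

```python
# Hypothetical sketch of a "verification before application" memory loop.
# Names and thresholds are illustrative, not from the PhysMem paper.
from dataclasses import dataclass


@dataclass
class Hypothesis:
    prop: str            # physical property, e.g. "friction_coeff"
    value: float         # estimated value
    verified: bool = False


class TestTimeMemory:
    def __init__(self):
        self.experiences = []   # raw interaction records
        self.candidates = []    # unverified hypotheses
        self.validated = []     # verified knowledge, promoted for planning

    def record(self, observation):
        self.experiences.append(observation)

    def propose(self, prop, value):
        h = Hypothesis(prop, value)
        self.candidates.append(h)
        return h

    def verify(self, h, probe, tolerance=0.1):
        # probe(h) runs a targeted test and returns the measured value;
        # promote the hypothesis only if the measurement agrees with it
        measured = probe(h)
        if abs(measured - h.value) <= tolerance:
            h.verified = True
            self.candidates.remove(h)
            self.validated.append(h)
        return h.verified

    def lookup(self, prop):
        # the planner consults only verified knowledge
        return [h for h in self.validated if h.prop == prop]


# Example: hypothesize a rolling-friction coefficient, then verify it
mem = TestTimeMemory()
mem.record({"action": "push_ball", "displacement": 0.42})
h = mem.propose("friction_coeff", 0.30)
mem.verify(h, probe=lambda hyp: 0.32)  # targeted test measures 0.32
print([v.prop for v in mem.lookup("friction_coeff")])
```

The key design point mirrored here is that `lookup` never returns unverified candidates, so a hypothesis that fails its targeted test can never influence planning, which is what reduces rigid reliance on stale prior experience when conditions change.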
- PhysMem framework boosts robot manipulation success from 23% to 76% on brick insertion tasks
- Enables test-time learning through experience recording and hypothesis verification, without updating model parameters
- Uses a 'verification before application' design to adapt in real time to changing physical conditions
Why It Matters
Enables robots to learn and adapt to real-world physics dynamically, making them more reliable for complex manipulation tasks in unstructured environments.