Robotics

HMR-1: Hierarchical Massage Robot with Vision-Language-Model for Embodied Healthcare

Researchers unveil a hierarchical robot framework that uses multimodal large language models (MLLMs) to ground acupoints and a low-level controller to plan massage trajectories.

Deep Dive

A research team from multiple institutions has published a paper on HMR-1 (Hierarchical Massage Robot-1), a framework that combines robotics with multimodal AI for embodied healthcare. The system addresses two gaps in the field: the lack of standardized benchmarks and the scarcity of open-source datasets for acupoint massage. To address both, the researchers created MedMassage-12K, a multimodal dataset containing 12,190 images and 174,177 question-answer pairs (about 14 per image on average) that covers diverse lighting conditions and backgrounds. The dataset is intended as a foundational resource for training and evaluating AI models in therapeutic contexts.

The HMR-1 framework has a two-tiered architecture. The high-level module uses multimodal large language models (MLLMs) to interpret natural language instructions and visually locate acupoints on the human body. The low-level control module then translates the located points into robotic massage trajectories. The team evaluated several existing MLLMs on the new benchmark and demonstrated the framework's practical viability by fine-tuning Alibaba's Qwen-VL model. Physical experiments supported the system's applicability, illustrating how an MLLM can connect language understanding to physical action in rehabilitation settings.
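The two-tier split described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `plan_massage`, `circular_trajectory`, `to_pixels`, and `fake_grounder` are hypothetical, the high-level module is assumed to return a single normalized image point, and a circular path in pixel space stands in for whatever trajectory the real low-level controller generates.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float]


def to_pixels(norm_xy: Point, width: int, height: int) -> Point:
    """Convert a normalized (x, y) in [0, 1] to pixel coordinates."""
    x, y = norm_xy
    return (x * width, y * height)


def circular_trajectory(center: Point, radius: float, n_points: int = 12) -> List[Point]:
    """Low-level tier (assumed form): evenly spaced waypoints on a circle."""
    cx, cy = center
    return [
        (cx + radius * math.cos(2 * math.pi * k / n_points),
         cy + radius * math.sin(2 * math.pi * k / n_points))
        for k in range(n_points)
    ]


def plan_massage(
    instruction: str,
    grounder: Callable[[str], Point],
    width: int,
    height: int,
    radius: float = 10.0,
) -> List[Point]:
    """Chain the two tiers: ground the acupoint, then plan waypoints around it."""
    center = to_pixels(grounder(instruction), width, height)
    return circular_trajectory(center, radius)


def fake_grounder(instruction: str) -> Point:
    """Hypothetical stand-in for the MLLM; always returns the same point."""
    return (0.5, 0.25)


waypoints = plan_massage("Massage the requested acupoint", fake_grounder, 640, 480)
```

Passing the grounder in as a callable keeps the tiers decoupled in this sketch, so a fine-tuned MLLM could replace `fake_grounder` without changes to the trajectory code. A real system would also need to map image points to 3D robot coordinates and regulate contact force, neither of which is shown here.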

Key Points
  • Introduced MedMassage-12K dataset with 12,190 images and 174,177 QA pairs for training multimodal AI in acupoint therapy
  • Proposed a hierarchical framework combining MLLM-based acupoint grounding with low-level robotic trajectory control
  • Fine-tuned the Qwen-VL model and established the first benchmark for embodied massage tasks, with code and data publicly released

Why It Matters

This work pioneers standardized benchmarks for AI-driven physical therapy, potentially enabling scalable, personalized robotic rehabilitation.