Open Source

AllenAI's MolmoAct2: Open-Source 5B Model for Robot Control

Fully open-source VLA model with 5B parameters and multiple robotics fine-tunes...

Deep Dive

AllenAI (AI2) continues its rapid iteration on the MolmoAct series with MolmoAct2, a 5B-parameter vision-language-action (VLA) model purpose-built for robot control. Rather than releasing a single monolithic checkpoint, AllenAI is publishing a suite of variants on Hugging Face, each fine-tuned on a distinct robotics dataset. These include the LIBERO variant for general manipulation tasks, DROID for interactive human-in-the-loop scenarios, and BimanualYAM and SO100_101 for absolute joint-pose control. Each variant is openly downloadable, and the team has also released the full training data (including pretraining data), the training source code, and the accompanying technical papers.

This open-source approach is a significant departure from many proprietary robotics models. By releasing the complete stack (weights, data, code, and papers), AllenAI allows researchers and hobbyists not only to deploy the models but also to reproduce and extend the work. MolmoAct2 is designed to command robots through LLM-style inference, making it a practical tool for anyone building generalist robot controllers. The continuous stream of new fine-tunes suggests AllenAI is treating this as an active research platform, with more variants likely forthcoming. For developers experimenting with LLM-driven robotics, MolmoAct2 offers a transparent, flexible foundation.
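To make "inference to command robots" concrete, here is a minimal sketch of the closed loop a VLA model typically sits in: at each step the controller passes the latest observation and a text instruction to the policy and receives an action back. Everything named here (Observation, StubPolicy, run_episode) is hypothetical and written for illustration; it is not the MolmoAct2 interface, and the stub simply halves each joint angle where a real checkpoint would infer the action from the image and the instruction.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    """One control-step input: a camera frame plus current joint angles."""
    image: List[List[int]]        # placeholder for camera pixels
    joint_positions: List[float]  # current joint angles, in radians


class StubPolicy:
    """Hypothetical stand-in for a VLA checkpoint (NOT the MolmoAct2 API)."""

    def predict(self, obs: Observation, instruction: str) -> List[float]:
        # A real model would condition on obs.image and the instruction.
        # This stub ignores both and moves each joint halfway toward zero,
        # returning absolute joint-pose targets.
        return [q * 0.5 for q in obs.joint_positions]


def run_episode(policy: StubPolicy, instruction: str,
                start_joints: List[float], steps: int = 5) -> List[List[float]]:
    """Run a fixed number of observe -> predict -> act steps."""
    joints = list(start_joints)
    trajectory = [joints]
    for _ in range(steps):
        obs = Observation(image=[[0]], joint_positions=joints)
        # Assumption: the robot reaches the commanded pose exactly,
        # so the next observation equals the predicted target.
        joints = policy.predict(obs, instruction)
        trajectory.append(joints)
    return trajectory


if __name__ == "__main__":
    for pose in run_episode(StubPolicy(), "move to home", [1.0, -2.0], steps=3):
        print(pose)
```

Swapping the stub for a real checkpoint would mean replacing predict with a call into the released model code; the loop structure itself would stay the same.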

Key Points
  • MolmoAct2 is a 5B-parameter vision-language-action model from AllenAI, released with fine-tunes for specific robotics tasks such as general manipulation (LIBERO) and joint-pose control (BimanualYAM).
  • All model weights, training datasets (including pretraining data), source code, and technical papers are openly released.
  • The model is designed to be used with LLM inference for direct robot control, with new fine-tune variants being released regularly.

Why It Matters

MolmoAct2 democratizes advanced robot control by providing fully open-source VLA models, enabling rapid prototyping for researchers and robotics engineers.