AI for Industry Challenge: Classical Stack Beats ACT/smolVLA with 80% Success
ACT and smolVLA policies trained on ~550 episodes topped out at a score of 40, while a classical vision-based stack reached about 80% success on SFP insertion.
In the AI for Industry Challenge qualification phase, Robin_Tomar trained ACT and smolVLA policies on ~550 Gazebo-recorded episodes but only reached a maximum score of 40. Blue_dot moved from RL in IsaacLab to a classical vision-based stack that achieved about 80% success on SFP insertion, then collected ~300 trajectories for ACT training, which did not yield good performance. Robin_Tomar noted that data diversity and trajectory smoothness affect results, and observed the policy oscillating during inference.
- Robin_Tomar reached a maximum score of 40 with ACT and smolVLA policies trained on ~550 Gazebo-recorded episodes, and saw the policy oscillate during inference.
- Blue_dot's classical vision-based stack (no RL) reached about 80% success on SFP insertion; ACT trained on the ~300 trajectories collected afterward did not perform well.
- Blue_dot's classical stack scored 0% on a harder S-Port task, which shows that the approach is task-specific.
Why It Matters
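A practical takeaway from this robotics competition: in the qualification phase, a classical vision-based stack outperformed end-to-end learned policies (ACT, smolVLA) trained on a few hundred demonstrations, though it did not carry over to a harder task.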