
Tufts neuro-symbolic robotics system achieves 95% accuracy at 1% of the training energy of standard VLAs

2026-04-12 01:05

Researchers at Tufts University, led by Matthias Scheutz, published an arXiv paper showing that a neuro-symbolic vision-language-action (VLA) model outperforms standard VLAs on robotic manipulation: 95% vs. 34% success on Tower of Hanoi tasks, and 78% vs. 0% on unseen puzzle variations. Training took 34 minutes and consumed 1% of the energy required by conventional VLA training (which runs over 36 hours); operational energy was 5% of the standard approach's. The work challenges the assumption that scaling neural networks is the only viable path to capable robotic AI and will be presented at ICRA in Vienna in June 2026.