Agents
What Matters in Orchestrating Robot Policies: A Systematic Study of Hierarchical VLA Agents
This paper presents a systematic study of hierarchical vision-language-action (Hi-VLA) systems for robot manipulation, unifying a range of Hi-VLA agents under an options-style control framework. The authors benchmark core design choices across task horizons and reasoning intensities and find that well-structured model choices and interface mechanisms substantially improve performance, producing stronger systems than flat VLA control or poorly designed hierarchies. The findings give practitioners foundational principles for building more effective and robust Hi-VLA agents.
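To make "options-style control" concrete, here is a minimal sketch of the generic options pattern: a high-level selector picks an option, and that option's low-level policy acts until its termination condition fires. This is not the paper's implementation. The 1-D integer state, the `Option` class, `run_hierarchy`, `demo`, and the waypoint-based termination rule are all hypothetical names and assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Option:
    """Hypothetical option: a low-level policy plus a termination condition."""
    name: str
    policy: Callable[[int], int]       # low-level policy: state -> action
    terminated: Callable[[int], bool]  # termination condition on the state


def run_hierarchy(
    state: int,
    goal: int,
    select_option: Callable[[int, int], Option],
    step: Callable[[int, int], int],
    max_steps: int = 50,
) -> Tuple[int, List[str]]:
    """Run the two-level loop and return the final state and the option trace."""
    trace: List[str] = []
    option = None
    for _ in range(max_steps):
        if state == goal:
            break
        # The high level is consulted only at the start or when the
        # current option reports termination.
        if option is None or option.terminated(state):
            option = select_option(state, goal)
            trace.append(option.name)
        # The low level acts at every step.
        state = step(state, option.policy(state))
    return state, trace


def demo() -> Tuple[int, List[str]]:
    # Toy assumption: options terminate at every multiple of 5 (a "waypoint").
    right = Option("move_right", lambda s: 1, lambda s: s % 5 == 0)
    left = Option("move_left", lambda s: -1, lambda s: s % 5 == 0)

    def select(s: int, g: int) -> Option:
        return right if s < g else left

    return run_hierarchy(0, 7, select, lambda s, a: s + a)


if __name__ == "__main__":
    print(demo())
```

In this toy setting the selector is called twice (at the start and at the waypoint), while the low-level policy acts at every step. In a Hi-VLA system the selector would presumably be a vision-language planner and the low-level policy a VLA controller, but that mapping is an assumption here.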
robot, hierarchical, VLA