Agents
From Digital to Physical: Digital Agents as Autonomous Coaches for Physical Intelligence
The paper introduces EmboCoach-Bench, a benchmark that evaluates whether LLM agents can autonomously generate embodied policies across 32 expert-curated reinforcement learning (RL) and imitation learning (IL) tasks. The key findings are that these agents can outperform human-engineered baselines by 26.5% in success rate and can use environment feedback to optimize policies iteratively, reducing reliance on manual tuning. For practitioners, this points toward scalable, self-evolving approaches to developing embodied AI systems.
embodied-ai llm benchmark