Agents
ATOM-Bench: A Real-World Benchmark for Atomic Skills and Compositional Generalization in Manipulation Policies
ATOM-Bench is a real-world benchmark for evaluating atomic skills and compositional generalization in robotic manipulation policies. It comprises 30 atomic tasks and 24 held-out compositional tasks, with 3,000 human demonstrations used to fine-tune policies on both single-arm and dual-arm robot tracks. Its metrics include Atomic Score (AS) and Compositional Failure Share (CFS). The results show that current policies can learn basic instruction-grounding skills but often struggle with fine-grained motor skills and with transferring learned skills to new tasks, which marks both as critical areas for improvement in real-world robotic applications.
manipulation policies, robotic control, benchmark