Agents
LaWAM: Latent World Action Models for Efficient Dynamics-Aware Robot Policies
LaWAM (Latent World Action Model) generates robot policies from compact latent visual subgoals rather than from computationally expensive video generation. Its core is a latent-action-conditioned Latent World Model (LaWM) trained in the latent space of a pretrained vision foundation model. The reported results are state-of-the-art success rates of 98.6% on LIBERO and 91.22% on RoboTwin, with an inference latency of 187 ms per action-chunk prediction. For practitioners, this means dynamics-aware robot control without the compute cost of generating video, which makes real-time deployment more practical.
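To make the data flow concrete, here is a minimal sketch of the pipeline the summary describes: encode an observation, predict a latent subgoal conditioned on a latent action, then decode an action chunk. It is not the authors' implementation. Every name, dimension, and the random linear layers standing in for learned networks are hypothetical, and the latent action is simply sampled because the summary does not say how it is obtained.

```python
import math
import random

random.seed(0)

# Hypothetical sizes chosen for illustration only; none come from the paper.
IMG_DIM, LATENT_DIM, LATENT_ACTION_DIM = 48, 16, 4
ACTION_DIM, CHUNK_LEN = 7, 8


def random_matrix(rows, cols, scale=0.1):
    """Random weights standing in for a trained network layer."""
    return [[random.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


W_ENC = random_matrix(LATENT_DIM, IMG_DIM)                        # frozen vision encoder stand-in
W_WM = random_matrix(LATENT_DIM, LATENT_DIM + LATENT_ACTION_DIM)  # latent world model stand-in
W_PI = random_matrix(CHUNK_LEN * ACTION_DIM, 2 * LATENT_DIM)      # policy head stand-in


def encode(pixels):
    """Map a flattened observation to a latent vector."""
    return [math.tanh(v) for v in matvec(W_ENC, pixels)]


def predict_subgoal(z, latent_action):
    """Predict a future latent (the subgoal) instead of future video frames."""
    delta = matvec(W_WM, z + latent_action)  # list concatenation, then projection
    return [zi + di for zi, di in zip(z, delta)]


def predict_action_chunk(z, z_goal):
    """Decode CHUNK_LEN actions from the current latent and the latent subgoal."""
    flat = matvec(W_PI, z + z_goal)
    return [flat[i * ACTION_DIM:(i + 1) * ACTION_DIM] for i in range(CHUNK_LEN)]


pixels = [random.random() for _ in range(IMG_DIM)]
latent_action = [random.gauss(0.0, 1.0) for _ in range(LATENT_ACTION_DIM)]  # assumed input

z = encode(pixels)
z_goal = predict_subgoal(z, latent_action)
chunk = predict_action_chunk(z, z_goal)
print(len(z_goal), len(chunk), len(chunk[0]))  # 16 8 7
```

The point of the sketch is that the predicted subgoal is a vector of LATENT_DIM numbers, not a sequence of frames, which is where the summary locates the computational savings.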
robotics, policy, action-models