ai-digest.dev
Agents · arXiv cs.AI · 9 d ago

Inference-time Policy Steering via Vision and Touch

ViTaL is a new visuo-tactile inference-time steering framework that improves generative robot policies at deployment by using visual and tactile feedback to verify actions. It takes a bi-level optimization approach: visual sampling handles long-horizon decision-making, and tactile-guided diffusion editing handles short-horizon refinement. The reported result is a 51% improvement in success rate over baseline policies across contact-rich manipulation tasks. For practitioners, the framework addresses the limits of visual-only methods in complex manipulation, enabling more effective and adaptable robotic interaction in real-world environments.
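
To make the bi-level idea concrete, here is a toy sketch of the two stages: sample several candidate action chunks and keep the one a visual verifier scores highest, then locally edit that chunk to reduce a tactile cost. Everything in it is an assumption for illustration and not ViTaL's implementation: the random sampler stands in for a generative policy, `visual_score` and `tactile_cost` are hypothetical verifiers, and plain gradient descent stands in for tactile-guided diffusion editing.

```python
import numpy as np


def sample_candidates(rng, num_candidates, horizon, action_dim):
    """Stand-in for a generative policy: draws random action chunks."""
    return rng.normal(size=(num_candidates, horizon, action_dim))


def visual_score(chunk, goal):
    """Hypothetical visual verifier: higher when the summed actions land near the goal."""
    return -float(np.linalg.norm(chunk.sum(axis=0) - goal))


def tactile_cost(chunk, max_force):
    """Hypothetical tactile cost: squared penalty on action magnitudes above max_force."""
    excess = np.clip(np.abs(chunk) - max_force, 0.0, None)
    return float((excess ** 2).sum())


def tactile_cost_grad(chunk, max_force):
    """Analytic gradient of tactile_cost with respect to the action chunk."""
    excess = np.clip(np.abs(chunk) - max_force, 0.0, None)
    return 2.0 * excess * np.sign(chunk)


def steer(rng, goal, num_candidates=32, horizon=8, action_dim=2,
          max_force=0.5, edit_steps=20, step_size=0.2):
    # Outer level (long horizon): sample candidates, keep the best visual score.
    candidates = sample_candidates(rng, num_candidates, horizon, action_dim)
    scores = np.array([visual_score(c, goal) for c in candidates])
    best = candidates[int(np.argmax(scores))].copy()

    # Inner level (short horizon): gradient steps that lower the tactile cost.
    # This is a simplification, not an actual diffusion-editing procedure.
    refined = best.copy()
    for _ in range(edit_steps):
        refined -= step_size * tactile_cost_grad(refined, max_force)
    return candidates, best, refined


rng = np.random.default_rng(0)
goal = np.array([1.0, -1.0])
candidates, best, refined = steer(rng, goal)
print("tactile cost before edit:", tactile_cost(best, 0.5))
print("tactile cost after edit: ", tactile_cost(refined, 0.5))
```

Note that the inner edit in this toy version can pull the chunk away from the visual goal; a real system would need to balance the two objectives, which this sketch does not attempt.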

robotics · inference · multimodal · relevance 0.00 · engagement 0.00