Agents
Foresight: Iterative Reasoning About Clues that Matter for Navigation
The article introduces Foresight, a test-time framework for open-world mapless navigation that uses pretrained Vision-Language Models (VLMs) to iteratively refine motion plans based on environmental cues and language instructions. Its plan-critique loop is informed by human feedback; the authors report a 37% improvement in task success and a 52% reduction in interventions across six real-world environments, with the system running in real time on a Jetson AGX Orin. The framework is released with accompanying code and data, giving practitioners a starting point for improving robot navigation through adaptive reasoning.
navigation, reasoning, llm, environment
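
The plan-critique loop described in the summary can be illustrated with a minimal Python sketch. This is not the Foresight implementation: the names `refine_plan`, `Critique`, `toy_propose`, and `toy_critique` are hypothetical, and the proposer and critic are plain functions standing in for the VLM calls the summary mentions. The sketch only shows the general control flow of proposing a plan, critiquing it, and feeding the critique back into the next proposal until the critic accepts or a round limit is reached.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Plan = List[Tuple[float, float]]  # a plan is a list of (x, y) waypoints
Observation = Dict[str, float]    # hypothetical stand-in for camera input


@dataclass
class Critique:
    acceptable: bool
    feedback: str  # textual hint passed back to the proposer


def refine_plan(
    instruction: str,
    observation: Observation,
    propose: Callable[[str, Observation, str], Plan],
    critique: Callable[[str, Observation, Plan], Critique],
    max_rounds: int = 3,
) -> Tuple[Plan, int]:
    """Return the final plan and the number of revisions made."""
    feedback = ""
    plan = propose(instruction, observation, feedback)
    for round_idx in range(max_rounds):
        verdict = critique(instruction, observation, plan)
        if verdict.acceptable:
            return plan, round_idx
        # Feed the critic's objection into the next proposal.
        feedback = verdict.feedback
        plan = propose(instruction, observation, feedback)
    # Round limit reached: return the last proposal, uncritiqued.
    return plan, max_rounds


# Toy stand-ins (assumptions, not VLMs): the proposer cuts across the
# grass at y == grass_y unless the feedback mentions the sidewalk.
def toy_propose(instruction: str, observation: Observation, feedback: str) -> Plan:
    if "sidewalk" in feedback:
        return [(0.0, 1.0), (2.0, 1.0)]
    return [(0.0, 0.0), (2.0, 0.0)]


def toy_critique(instruction: str, observation: Observation, plan: Plan) -> Critique:
    if any(y == observation["grass_y"] for _, y in plan):
        return Critique(False, "path crosses grass; stay on the sidewalk")
    return Critique(True, "")
```

Calling `refine_plan("stay on the sidewalk", {"grass_y": 0.0}, toy_propose, toy_critique)` rejects the first plan once and accepts the revised one. The round limit is a design choice of this sketch, bounding latency per planning step, which matters on embedded hardware.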