Agents · arXiv cs.AI — 7 d ago

From Imitation to Alignment: Human-Preference Flow Policies for Long-Horizon Sidewalk Navigation

The paper presents FlowPilot, a mapless policy for long-horizon sidewalk navigation that uses only a monocular RGB camera. The policy is pre-trained with anchored flow matching on large-scale robot fleet data, and a human-in-the-loop preference learning scheme is used to improve counterfactual reasoning and social compliance. In simulation, FlowPilot achieves a 42% success rate and 66% route completion. In real-world tests, FlowPilot-HP reduces incident rates by 40% and non-injury rates by 52% compared to the base model. The results are relevant to practitioners working on autonomous navigation in complex environments.

navigation · imitation-learning · long-horizon · policy