ai-digest.dev
Agents · arXiv cs.AI · 10 d ago

Semantic Flip: Synthetic OOD Generation for Robust Refusal in Embodied Question Answering and Spatial Localization

The paper introduces Semantic Flip, a framework that generates synthetic out-of-distribution (OOD) samples to improve the refusal capabilities of embodied vision-language models (VLMs) on tasks such as Embodied Question Answering and spatial localization. It transforms queries and video memory to create OOD pairs, which are used to train a lightweight rejection module that can be integrated into existing VLM pipelines without retraining them. The paper reports superior performance on two benchmarks, with an F1 score of 0.9559 on the newly introduced SpaceReject benchmark, a result that bears on the reliability of embodied agents in real-world applications.
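To make the idea of a plug-in rejection module concrete, here is a minimal Python sketch of a gate placed in front of an existing answering function, plus the standard F1 computation used to score refusal decisions. Everything named here is an assumption for illustration: `gated_answer`, `toy_reject_score`, the word-overlap heuristic, and the 0.5 threshold are hypothetical stand-ins, not the paper's trained module or its data generation procedure.

```python
from typing import Callable, Sequence

REFUSAL = "I cannot answer that from what I have observed."


def gated_answer(
    query: str,
    memory: Sequence[str],
    answer_fn: Callable[[str, Sequence[str]], str],
    reject_score_fn: Callable[[str, Sequence[str]], float],
    threshold: float = 0.5,  # hypothetical cut-off, not from the paper
) -> str:
    """Refuse when the rejection score reaches the threshold, else answer.

    The wrapped answer_fn is left untouched, so the underlying model
    needs no retraining to gain a refusal path.
    """
    if reject_score_fn(query, memory) >= threshold:
        return REFUSAL
    return answer_fn(query, memory)


def toy_reject_score(query: str, memory: Sequence[str]) -> float:
    """Toy stand-in for a learned module: fraction of query words
    that never appear in the (text-captioned) memory."""
    seen = set(" ".join(memory).lower().split())
    words = query.lower().split()
    if not words:
        return 1.0
    return sum(1 for w in words if w not in seen) / len(words)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1, treating 1 as 'should refuse' / 'did refuse'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    memory = ["a red mug on the kitchen table"]
    answer = lambda q, m: "kitchen"  # placeholder for a real VLM call
    print(gated_answer("red mug", memory, answer, toy_reject_score))
    print(gated_answer("blue sofa", memory, answer, toy_reject_score))
```

In this sketch the in-distribution query is passed through to the answering function, while the query about objects absent from memory is refused. A learned module would replace `toy_reject_score` with a classifier trained on in-distribution and synthetic OOD pairs.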

embodied agents · question answering · spatial reasoning · relevance 0.00 · engagement 0.00