Research · arXiv cs.AI · 7 d ago

Mirage Probes: How Vision Models Fake Visual Understanding

The paper introduces "Mirage Probes," a contrastive probing framework for studying mirage behavior in vision-language models (VLMs), where a model gives confident answers to image-based questions without actual visual input. The study reports that this behavior is detectable from internal activations across various model components, and it separates two distinct failure modes: textual biases and spurious images. Text-distribution cleaning can mitigate textual biases, whereas spurious images appear to require deeper interventions at the representational level. These findings matter for practitioners aiming to improve the visual grounding of VLMs in real-world applications.

vision-language · mlm · mirage
