Safety · arXiv cs.AI · 11 d ago

Detect Before You Leap: Mirage Detection in Vision-Language Models

The article presents a method for detecting "mirage" responses in vision-language models (VLMs): confident answers given without sufficient visual evidence. The proposed Text-Conditioned Layer-wise Internal Alignment (TC-LIA) technique evaluates layer-wise visual representations from a CLIP ViT-H/14 encoder to assess whether the visual evidence is relevant to the question. The Qwen2.5-VL-32B model achieved a detection accuracy of 94.7% with a 3.0% mirage rate, outperforming the baseline models. The result highlights the importance of detecting mirages before answering for the reliability of visual question answering (VQA) in critical applications.
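The general idea of scoring question-to-image relevance layer by layer can be illustrated with a minimal sketch. This is not the paper's TC-LIA implementation: the pooled per-layer vectors, the shared embedding dimension, the mean aggregation, the 0.2 threshold, and the function names are all assumptions made for illustration, and real encoder features would replace the toy arrays.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors (small epsilon avoids 0/0)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def layerwise_alignment(layer_feats, text_emb):
    """Similarity of the question embedding to each layer's pooled visual vector.

    layer_feats: list of (d,) arrays, one pooled visual vector per encoder layer
                 (hypothetical input; assumed already projected to the text space).
    text_emb:    (d,) array embedding the question.
    """
    return np.array([cosine(f, text_emb) for f in layer_feats])

def flag_mirage_risk(layer_feats, text_emb, threshold=0.2):
    """Flag a question as mirage-prone when mean alignment is below a threshold.

    The mean aggregation and the threshold value are illustrative assumptions.
    """
    scores = layerwise_alignment(layer_feats, text_emb)
    return bool(scores.mean() < threshold), scores

# Toy usage: visual features orthogonal to the question get flagged.
question = np.array([1.0, 0.0])
aligned = [np.array([1.0, 0.0]), np.array([2.0, 0.0])]
unrelated = [np.array([0.0, 1.0]), np.array([0.0, 3.0])]
flag_aligned, _ = flag_mirage_risk(aligned, question)
flag_unrelated, _ = flag_mirage_risk(unrelated, question)
```

In this sketch the check runs before the VLM generates an answer, so a flagged question could be routed to abstention or a request for clearer visual input.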

vision · llm · detection
