Safety
Efficient Hallucination Detection for LLMs Using Uncertainty-Aware Attention Heads
The paper introduces Recurrent Attention-based Uncertainty Quantification (RAUQ), an unsupervised method that detects hallucinations in large language models (LLMs) by analyzing "uncertainty-aware" attention heads. RAUQ estimates sequence-level uncertainty in a single forward pass and, according to the authors, outperforms existing uncertainty quantification methods across twelve datasets while adding less than 1% computational overhead. Because it requires neither labeled data nor extensive parameter tuning, practitioners can use it for real-time hallucination detection.
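The summary does not give RAUQ's formulas, so the following is only an illustrative sketch of how attention weights and token probabilities might be folded into a single sequence-level score. It assumes a head is chosen by its mean attention to the preceding token, and that each token's confidence mixes its probability with the previous token's confidence, weighted by that attention. The names `select_head` and `recurrent_uncertainty` and the mixing weight `alpha` are hypothetical, and the paper's exact formulation may differ.

```python
import numpy as np

def select_head(attn):
    """Pick a head for one layer (assumed selection rule).

    attn: array of shape (heads, N, N), where attn[h, i, j] is the attention
    from token i to token j. Returns the index of the head with the largest
    mean attention to the immediately preceding token.
    """
    # Sub-diagonal entries attn[h, i, i-1], shape (heads, N-1).
    prev = np.diagonal(attn, offset=-1, axis1=1, axis2=2)
    return int(np.argmax(prev.mean(axis=1)))

def recurrent_uncertainty(token_probs, prev_attn, alpha=0.5):
    """Sequence-level uncertainty from a recurrent confidence score.

    token_probs: length-N probabilities of the generated tokens (all > 0).
    prev_attn:   length N-1; prev_attn[i-1] is the attention from token i
                 to token i-1 in the selected head.
    alpha:       hypothetical mixing weight in [0, 1].
    """
    token_probs = np.asarray(token_probs, dtype=float)
    prev_attn = np.asarray(prev_attn, dtype=float)
    conf = np.empty_like(token_probs)
    conf[0] = token_probs[0]
    for i in range(1, len(token_probs)):
        # Mix the current token probability with the carried-over confidence.
        conf[i] = alpha * token_probs[i] + (1 - alpha) * prev_attn[i - 1] * conf[i - 1]
    # Mean negative log-confidence; clip to avoid log(0).
    return float(-np.mean(np.log(np.clip(conf, 1e-12, None))))
```

With `alpha=1.0` the score reduces to the mean negative log-probability of the tokens, which makes a convenient sanity check. In practice the probabilities and attention maps would come from the same forward pass that generated the text, which is what keeps this kind of score cheap.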
hallucination, detection, uncertainty