Safety · arXiv cs.AI · 7 d ago

Zero-source LLM Hallucination Detection with Human-like Criteria Probing

The paper presents Human-like Criteria Probing for Hallucination Detection (HCPD), a method for detecting hallucinations in large language models (LLMs) under zero-source conditions, relying solely on the text of query-answer pairs. HCPD uses a reward-based alignment scheme and a multi-sampling aggregation strategy to produce interpretable truthfulness measures, and the authors report that it outperforms state-of-the-art baselines in their experiments. For practitioners, the approach aims to make hallucination detection more reliable and explainable, which matters for deploying LLMs safely in real-world applications.
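As a rough illustration of what a multi-sampling aggregation step might look like, the sketch below scores the same query-answer pair several times and averages the results. This is not the paper's implementation: the summary does not specify how HCPD samples or aggregates, so `score_once`, `aggregate_truthfulness`, `is_hallucination`, the mean aggregation, and the 0.5 threshold are all hypothetical choices made for the example.

```python
from statistics import mean
from typing import Callable


def aggregate_truthfulness(
    query: str,
    answer: str,
    score_once: Callable[[str, str], float],  # hypothetical scorer returning a value in [0, 1]
    n_samples: int = 5,
) -> float:
    """Average several independent truthfulness scores for one query-answer pair.

    Assumption: a plain mean is used here for illustration only; the paper's
    actual aggregation rule is not described in this summary.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    scores = [score_once(query, answer) for _ in range(n_samples)]
    return mean(scores)


def is_hallucination(score: float, threshold: float = 0.5) -> bool:
    """Flag the answer when the aggregated score falls below an assumed threshold."""
    return score < threshold
```

In practice `score_once` would wrap a stochastic call to a scoring model, so that repeated samples differ and averaging reduces the variance of any single judgment.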

hallucination · llm · detection
