Coding · arXiv cs.AI · 7 d ago

HalluJudge: A Reference-Free Hallucination Detection for Context Misalignment in Code Review Automation

HalluJudge is a framework for detecting hallucinations in code review comments generated by Large Language Models (LLMs), without requiring reference data. It uses four assessment strategies, including structured multi-branch reasoning techniques such as Tree-of-Thoughts, and reports an F1 score of 0.85 at an average cost of $0.009 per assessment. For practitioners, it offers a low-cost way to make AI-assisted code review more reliable by aligning automated outputs with developer preferences, which reduces the risk posed by hallucinated comments.
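To put the two reported figures in context, here is a minimal sketch of how an F1 score is derived from detection counts and how the per-assessment cost scales with volume. The function names and the confusion counts are hypothetical, chosen only so that the arithmetic lands on 0.85; they are not taken from the paper. Only the $0.009 average cost comes from the summary above.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1: 2*TP / (2*TP + FP + FN). Returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def projected_cost(n_assessments: int, cost_per_assessment: float = 0.009) -> float:
    """Total cost in USD, assuming the reported average holds for every assessment."""
    return n_assessments * cost_per_assessment


# Hypothetical counts (NOT from the paper): 85 hallucinated comments correctly
# flagged, 15 false alarms, 15 missed hallucinations.
print(round(f1_score(tp=85, fp=15, fn=15), 2))  # 0.85

# Assessing 1,000 review comments at the reported average cost.
print(round(projected_cost(1000), 2))  # 9.0
```

Under that assumption, assessing a thousand generated review comments would cost roughly nine dollars.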

llm · code review · hallucination
