SafetyarXiv cs.AI — 15 d ago

AURA: Adaptive Uncertainty-aware Refinement for LLM-as-a-Judge Auditing

AURA is introduced as an adaptive uncertainty-aware refinement framework designed to improve the auditing of large language models (LLMs) used as judges in open-ended generation tasks. It operates by iteratively learning a human-consistency signal and prioritizing uncertain comparisons for human review, addressing the challenges of bias and scarcity in human verification. This framework provides a stable refinement procedure and has been evaluated on both synthetic and real pairwise LLM-answer data, which is significant for practitioners seeking to enhance the reliability of LLM evaluations.

LLMauditinghuman verificationrelevance 0.00 · engagement 0.00

Read at source ↗← all news