Training · arXiv cs.CL · 2 d ago

On Cost-Effective LLM-as-a-Judge Improvement Techniques

The paper presents four low-cost techniques for improving the accuracy of language model judges in reinforcement learning from human feedback (RLHF) pipelines: ensemble scoring, task-specific criteria injection, calibration context, and adaptive model escalation. On RewardBench 2, ensemble scoring combined with criteria injection reaches 85.8% accuracy, 13.5 percentage points above the baseline, and small models benefit significantly from these methods. For practitioners, the techniques offer scalable ways to make LLM evaluation more reliable without substantial added cost.
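To illustrate how two of the named techniques might fit together, here is a minimal sketch under stated assumptions, not the paper's implementation. It assumes ensemble scoring means averaging k independent judge scores and criteria injection means listing task-specific criteria in the judge prompt. All names (`build_prompt`, `ensemble_score`, `pick_best`, `make_stub_judge`) are hypothetical, and the judge is a deterministic stub standing in for a real LLM call.

```python
from statistics import mean
from typing import Callable, Sequence


def build_prompt(question: str, answer: str, criteria: Sequence[str]) -> str:
    """Criteria injection (assumed form): list task-specific criteria in the prompt."""
    lines = ["Rate the answer from 1 to 10.", "Criteria:"]
    lines += [f"- {c}" for c in criteria]
    lines += [f"Question: {question}", f"Answer: {answer}", "Score:"]
    return "\n".join(lines)


def ensemble_score(judge: Callable[[str], float], prompt: str, k: int = 3) -> float:
    """Ensemble scoring (assumed form): mean of k independent judge calls."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return mean(judge(prompt) for _ in range(k))


def pick_best(judge: Callable[[str], float], question: str,
              answers: Sequence[str], criteria: Sequence[str], k: int = 3) -> int:
    """Return the index of the answer with the highest ensemble score."""
    scores = [ensemble_score(judge, build_prompt(question, a, criteria), k)
              for a in answers]
    return max(range(len(answers)), key=scores.__getitem__)


def make_stub_judge(noise: Sequence[float]) -> Callable[[str], float]:
    """Hypothetical stand-in for an LLM judge: a base score plus cyclic noise."""
    state = {"calls": 0}

    def judge(prompt: str) -> float:
        base = 8.0 if "Paris" in prompt else 4.0  # toy notion of answer quality
        delta = noise[state["calls"] % len(noise)]
        state["calls"] += 1
        return base + delta

    return judge


if __name__ == "__main__":
    question = "What is the capital of France?"
    answers = ["Lyon", "Paris"]
    criteria = ["factually correct", "directly answers the question"]
    for k in (1, 2):
        judge = make_stub_judge([5.0, -5.0])  # fresh noisy judge per run
        print(k, answers[pick_best(judge, question, answers, criteria, k=k)])
```

With this stub, a single noisy score can rank the wrong answer first, while averaging two scores cancels the noise. Real judge noise is not this regular, so the example only shows the mechanism, not the size of any gain.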

Tags: llm · rlhf · ensemble · accuracy · noise
