Inference
CGES: Confidence-Guided Early Stopping for Efficient and Accurate Self-Consistency
The paper introduces Confidence-Guided Early Stopping (CGES), a Bayesian framework that makes self-consistency for large language models (LLMs) more efficient by halting sampling adaptively, based on the posterior mass of candidate answers. Across five reasoning benchmarks, CGES cuts the average number of model calls by about 58% (from 16.0 to 6.7) while staying within 0.4 percentage points of the accuracy of standard self-consistency. For practitioners, this means fewer LLM queries and lower compute cost with only a marginal change in accuracy.
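To make the stopping idea concrete, here is a minimal Python sketch of posterior-based early stopping. It is not the paper's exact formulation: the likelihood model (a sample's confidence is the probability that its answer is correct, with the remainder spread uniformly over the other candidates), the uniform prior, and the names `confidence_guided_stop`, `posterior_over_answers`, `sample_fn`, `num_answers`, and `threshold` are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Tuple

Sample = Tuple[str, float]  # (final answer, confidence in (0, 1))


def posterior_over_answers(history: List[Sample], num_answers: int) -> Dict[str, float]:
    """Posterior mass of each answer seen so far.

    Assumed model (illustrative, not taken from the paper): a uniform prior over
    `num_answers` candidates; a sample (a, c) has likelihood c if a is the true
    answer and (1 - c) / (num_answers - 1) otherwise.
    """
    def log_miss(c: float) -> float:
        return math.log((1.0 - c) / (num_answers - 1))

    seen = sorted({a for a, _ in history})
    log_scores = {
        cand: sum(math.log(c) if a == cand else log_miss(c) for a, c in history)
        for cand in seen
    }
    # Candidates never sampled still hold some mass; lump them into one term.
    terms = list(log_scores.values())
    unseen = max(num_answers - len(seen), 0)
    if unseen > 0:
        terms.append(math.log(unseen) + sum(log_miss(c) for _, c in history))
    # Normalize in log space for numerical stability.
    peak = max(terms)
    log_z = peak + math.log(sum(math.exp(t - peak) for t in terms))
    return {a: math.exp(s - log_z) for a, s in log_scores.items()}


def confidence_guided_stop(
    sample_fn: Callable[[], Sample],
    max_calls: int = 16,
    threshold: float = 0.95,
    num_answers: int = 5,
) -> Tuple[str, int, float]:
    """Sample until one answer's posterior mass reaches `threshold` or the budget ends.

    Returns (best answer, number of calls used, posterior mass of best answer).
    """
    if num_answers < 2 or max_calls < 1:
        raise ValueError("need num_answers >= 2 and max_calls >= 1")
    history: List[Sample] = []
    best, mass = "", 0.0
    for calls in range(1, max_calls + 1):
        answer, conf = sample_fn()
        # Clamp so that log(conf) and log(1 - conf) stay finite.
        history.append((answer, min(max(conf, 1e-6), 1.0 - 1e-6)))
        posterior = posterior_over_answers(history, num_answers)
        best = max(posterior, key=posterior.get)
        mass = posterior[best]
        if mass >= threshold:
            return best, calls, mass
    return best, max_calls, mass
```

Under these assumptions, a stub sampler that always returns `("42", 0.9)` with five candidates stops after two calls, because the posterior mass of "42" is 0.9 after the first sample and about 0.997 after the second. A sampler whose answers disagree at low confidence keeps drawing until the 16-call budget is exhausted, which matches fixed-budget self-consistency.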
llm, self-consistency, early stopping