Granularity-Regulated Adaptive Computational Efficiency for Optimal Verification in Test-Time Scaling
The paper introduces GRACE (Granularity-Regulated Adaptive Computational Efficiency), a theoretical framework for choosing verification granularity in test-time scaling (TTS) of large language models (LLMs). It identifies a phase transition in which the more effective verification strategy depends on compute budget and problem difficulty: fine-grained verification is superior for complex problems with ample compute, while coarse-grained verification is better for simpler tasks with limited resources. On the MATH-500, GSM8K, and AIME benchmarks, the adaptive strategy improves accuracy by up to 3.1% over fixed-granularity approaches, a result relevant to practitioners tuning inference performance.
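The qualitative rule stated above can be illustrated with a minimal sketch. This is not the paper's algorithm: the class name `GranularityPolicy`, both threshold values, the [0, 1] difficulty scale, and the budget unit are hypothetical placeholders chosen for illustration. The summary only covers two regimes (hard problem with ample compute, easy problem with limited compute), so the fallback to coarse verification in the mixed cases is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GranularityPolicy:
    """Toy selector between fine- and coarse-grained verification.

    Both thresholds are hypothetical placeholders, not values from the paper.
    """

    difficulty_threshold: float = 0.5  # estimated difficulty on an assumed [0, 1] scale
    budget_threshold: int = 64         # compute budget in an assumed unit (e.g. candidates)

    def choose(self, difficulty: float, budget: int) -> str:
        """Return "fine" or "coarse" for the given difficulty and budget."""
        if not 0.0 <= difficulty <= 1.0:
            raise ValueError("difficulty must lie in [0, 1]")
        if budget < 0:
            raise ValueError("budget must be non-negative")

        hard = difficulty >= self.difficulty_threshold
        ample = budget >= self.budget_threshold

        # Stated in the summary: hard + ample compute -> fine-grained;
        # easy + limited compute -> coarse-grained.
        # Assumption: the two mixed cases also fall back to coarse-grained.
        return "fine" if hard and ample else "coarse"


policy = GranularityPolicy()
hard_ample = policy.choose(difficulty=0.9, budget=256)
easy_limited = policy.choose(difficulty=0.2, budget=8)
```

Here `hard_ample` selects fine-grained verification and `easy_limited` selects coarse-grained verification. In practice the boundary between regimes would come from the framework's analysis rather than from fixed constants like these.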