Research
Quantifying Consistency in LLM Logical Reasoning via Structural Uncertainty
The paper introduces structural uncertainty, a framework that quantifies how consistently a large language model (LLM) reasons logically by analyzing the model's self-preference rankings over candidate solutions. It aggregates pairwise preferences using Bradley-Terry modeling with PageRank and decomposes the resulting uncertainty into two components: across-trial ranking instability and within-trial candidate ambiguity. The reported findings indicate that these structural signals improve the identification of unreliable reasoning instances across multiple LLMs and benchmarks, which makes reasoning consistency a useful evaluation target for practitioners applying LLMs to logical and mathematical tasks.
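To make the pipeline concrete, here is a minimal sketch of the two aggregation steps and a two-part uncertainty score. It rests on several assumptions that do not come from the paper: the summary does not say how Bradley-Terry and PageRank are combined, so the two are shown as separate scorers; normalized entropy stands in for within-trial candidate ambiguity; and the mean Kendall distance between trial rankings stands in for across-trial instability. All function names, the `smooth` pseudo-count, and the damping factor are illustrative choices.

```python
import math
from itertools import combinations


def bradley_terry(wins, smooth=0.5, iters=500, tol=1e-12):
    """Bradley-Terry strengths via MM updates; wins[i][j] = times i was preferred to j.

    `smooth` is an assumed pseudo-count that keeps every strength positive.
    """
    n = len(wins)
    w = [[0.0 if i == j else wins[i][j] + smooth for j in range(n)] for i in range(n)]
    p = [1.0 / n] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            won = sum(w[i][j] for j in range(n) if j != i)
            denom = sum((w[i][j] + w[j][i]) / (p[i] + p[j]) for j in range(n) if j != i)
            new.append(won / denom)
        total = sum(new)
        new = [x / total for x in new]
        converged = max(abs(a - b) for a, b in zip(new, p)) < tol
        p = new
        if converged:
            break
    return p


def pagerank(wins, damping=0.85, iters=100):
    """PageRank on the preference graph: each candidate passes rank to those that beat it."""
    n = len(wins)
    r = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - damping) / n] * n
        for j in range(n):
            lost = sum(wins[i][j] for i in range(n) if i != j)
            for i in range(n):
                if i == j:
                    continue
                # An unbeaten candidate spreads its rank uniformly (dangling node).
                share = wins[i][j] / lost if lost > 0 else 1.0 / (n - 1)
                new[i] += damping * r[j] * share
        r = new
    return r


def normalized_entropy(scores):
    """Assumed proxy for within-trial ambiguity: 0 = one clear winner, 1 = all tied."""
    total = sum(scores)
    h = -sum((s / total) * math.log(s / total) for s in scores if s > 0)
    return h / math.log(len(scores))


def kendall_distance(a, b):
    """Fraction of candidate pairs that score lists a and b order differently."""
    pairs = list(combinations(range(len(a)), 2))
    discordant = sum(1 for i, j in pairs if (a[i] - a[j]) * (b[i] - b[j]) < 0)
    return discordant / len(pairs)


def structural_uncertainty(trials, scorer=bradley_terry):
    """Return (across_trial_instability, within_trial_ambiguity) for a list of win matrices."""
    scores = [scorer(w) for w in trials]
    within = sum(normalized_entropy(s) for s in scores) / len(scores)
    pairs = list(combinations(scores, 2))
    across = sum(kendall_distance(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    return across, within
```

Each trial is one square matrix of pairwise self-preference counts over the same candidates. Under these assumed definitions, a high first value means the ranking changes from trial to trial, and a high second value means that no candidate clearly dominates within a trial.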
llm, logical reasoning, consistency