Research · arXiv cs.AI · 21 h ago

The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning

The paper introduces the Heuristic Override Benchmark (HOB), a set of 500 instances spanning several heuristic and constraint families, to evaluate how large language models (LLMs) reason when a surface cue conflicts with an implicit constraint. An analysis of six models, including Gemini 3.1 Pro, shows that surface cues can override implicit constraints: no model exceeds 75% under strict evaluation. A minimal hint can improve performance by 15 percentage points, which points to a failure of constraint inference. The authors conclude that practitioners need to understand and mitigate heuristic biases in LLMs, and that explicit goal decomposition and internal deliberation can improve reasoning.

llm · reasoning · heuristics
