Agents
Optimizing Agentic Reasoning with Retrieval via Synthetic Semantic Information Gain Reward
The paper introduces InfoReasoner, a framework that improves agentic reasoning in large reasoning models (LRMs) by training the retrieval process with a synthetic semantic information gain reward. It redefines information gain as the reduction in uncertainty over the model's belief states and uses an output-aware intrinsic estimator to make the optimization scalable, reporting accuracy improvements of up to 5.4% across seven question-answering benchmarks. The result is a theoretically grounded way to reward information-seeking behavior in LLMs, which is relevant to practitioners building more efficient retrieval-augmented systems.
agentic-reasoning retrieval reinforcement-learning
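
To make "information gain as uncertainty reduction in belief states" concrete, here is a minimal sketch. It assumes a belief state can be approximated by the empirical distribution over answers sampled from the model, and it uses plain Shannon entropy. This is a generic illustration, not the paper's output-aware semantic estimator; the function names and sample answers are hypothetical.

```python
import math
from collections import Counter


def entropy(answers):
    """Shannon entropy (in bits) of the empirical distribution over sampled answers."""
    counts = Counter(answers)
    total = len(answers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(before, after):
    """Uncertainty reduction: entropy before retrieval minus entropy after retrieval."""
    return entropy(before) - entropy(after)


# Hypothetical answers sampled before and after a retrieval step.
before = ["Paris", "Lyon", "Paris", "Marseille"]
after = ["Paris", "Paris", "Paris", "Paris"]

gain = information_gain(before, after)
print(f"information gain: {gain:.2f} bits")
```

Under these assumptions, a retrieval step that concentrates the sampled answers on one candidate yields a positive gain, while a step that leaves the answer distribution unchanged yields zero. A quantity of this kind could serve as a reward signal for the retrieval policy.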