Training
Closing the Feedback Loop: From Experience Extraction to Insight Governance in Verbal Reinforcement Learning
The paper presents a framework for training-free verbal reinforcement learning, in which LLM agents learn from world feedback by extracting verbal rules and updating their behavior without changing model parameters. It identifies a retention-forgetting dilemma in non-stationary environments and proposes a three-layer architecture of rules, evidence, and skills, connected by a feedback-driven curation loop that strengthens insight governance. A financial forecasting case study shows significant improvements in accuracy and risk-adjusted returns when the curation loop is enabled.
reinforcement learning, knowledge, governance
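
To make the idea of a feedback-driven curation loop concrete, here is a minimal Python sketch. It is not the paper's implementation: the `Rule` class, the `curate` and `build_prompt` functions, the sliding-window support score, the 0.5 threshold, and the example rules are all assumptions chosen for illustration. It models only the rules and evidence layers (the skills layer is omitted) and shows how recent feedback can retire a rule that stops working in a non-stationary setting, without touching model parameters.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """A verbal rule plus the evidence recorded for it (hypothetical structure)."""
    text: str
    evidence: list[bool] = field(default_factory=list)  # True = outcome supported the rule

    def support(self, window: int) -> float:
        """Fraction of the most recent `window` outcomes that supported the rule."""
        recent = self.evidence[-window:]
        # Neutral score when no evidence has been recorded yet (an assumption).
        return sum(recent) / len(recent) if recent else 0.5


def curate(rules: list[Rule], window: int = 4, retire_below: float = 0.5) -> list[Rule]:
    """Keep rules whose recent support meets the threshold; drop the rest."""
    return [r for r in rules if r.support(window) >= retire_below]


def build_prompt(rules: list[Rule]) -> str:
    """Render the active rules as text that could be prepended to an agent prompt."""
    return "\n".join(f"- {r.text}" for r in rules)


# Two illustrative rules (invented for this sketch).
rules = [
    Rule("Prefer momentum signals after earnings beats."),
    Rule("Fade large gaps at the open."),
]

# Simulated world feedback: the first rule keeps working, the second stops working.
feedback = [(True, True), (True, False), (True, False), (True, False)]
for first_ok, second_ok in feedback:
    rules[0].evidence.append(first_ok)
    rules[1].evidence.append(second_ok)

active = curate(rules)
print(build_prompt(active))
```

The windowed score is one simple way to trade retention against forgetting: a longer window keeps rules alive through short bad streaks, while a shorter one drops stale rules faster.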