Training · arXiv cs.CL · 8 d ago

Retrospective Progress-Aware Self-Refinement for LLM Agent Training

The paper introduces Retrospective Progress-Aware Training (RePro), a framework for training LLM-based agents in which the agent generates its own progress signals through a forward-then-reflect rollout. RePro combines a Retrospection Warmup with a composite reward mechanism. Initial results indicate improved performance on WebShop, ALFWorld, and Sokoban, with up to a 12% absolute increase in success rate for the Qwen family of models. The approach addresses limitations of traditional reinforcement learning in long-horizon tasks and may lead to more effective agent training methods.

llm · agent-training · reinforcement-learning · relevance 0.00 · engagement 0.00
