Training
MetaResearcher: Scaling Deep Research via Self-Reflective Reinforcement Learning in Adversarial Virtual Environments
MetaResearcher is a framework for training deep research agents that targets two limitations of existing approaches: static training environments and traditional reinforcement learning. It combines four components: an Evolving Virtual World that strengthens source credibility assessment, Discovery-Oriented Tasks that elicit genuine research behaviors, a Self-Reflective Meta-Reward mechanism that optimizes multiple performance metrics, and a Heterogeneous Multi-Agent Swarm architecture that supports collaborative strategies. The framework aims to improve performance on the GAIA and Xbench-DS benchmarks while remaining robust to adversarial misinformation, which makes it relevant to practitioners building AI-driven research tools.
reinforcement-learning, agents, training