Agents
Divide and Cooperate: Role-Decomposed Multi-Agent LLM Training with Cross-Agent Learning Signals
The paper introduces DAC (Divide and Cooperate), a role-decomposed multi-agent training framework that assigns evidence acquisition and answer generation to distinct agents. This split shrinks the combinatorial policy space and eases the credit assignment problem that a single policy faces when it must learn both behaviors at once. Each agent is a parameter-efficient LoRA module over a shared backbone, and the paper reports improved performance on general and multi-hop QA benchmarks compared to full fine-tuning approaches. For practitioners, the framework offers a more efficient way to train LLMs on complex reasoning tasks by using structured cross-agent learning signals.
multi-agent, training, role-decomposed
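To make the "LoRA modules over a shared backbone" idea concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the class names (`LoRAAdapter`, `SharedBackboneAgents`), the role labels, the toy weight matrix, and the rank and scaling values are all hypothetical, and no training loop or cross-agent learning signal is shown. It only illustrates that each role adds its own low-rank update to one frozen weight matrix, so changing one role's adapter leaves the other role's output untouched.

```python
import random


def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LoRAAdapter:
    """Low-rank update delta(x) = (alpha / rank) * B @ (A @ x). Hypothetical helper."""

    def __init__(self, d_in, d_out, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        # A gets a small random init, B starts at zero, so the adapter
        # contributes nothing until B is updated.
        self.A = [[rng.gauss(0.0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def delta(self, x):
        return [self.scale * v for v in matvec(self.B, matvec(self.A, x))]


class SharedBackboneAgents:
    """One frozen weight matrix shared by all roles, one adapter per role."""

    def __init__(self, W, roles, rank=2):
        self.W = W  # shared backbone weights, never modified here
        d_out, d_in = len(W), len(W[0])
        self.adapters = {
            role: LoRAAdapter(d_in, d_out, rank, seed=i)
            for i, role in enumerate(roles)
        }

    def forward(self, role, x):
        base = matvec(self.W, x)
        return [b + d for b, d in zip(base, self.adapters[role].delta(x))]


# Toy backbone and the two roles named in the summary (labels are assumptions).
W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]
agents = SharedBackboneAgents(W, roles=["evidence", "answer"])
x = [1.0, 2.0, 3.0]

base_out = agents.forward("evidence", x)

# Stand-in for a gradient step that touches only the answer agent's adapter.
agents.adapters["answer"].B[0][0] = 1.0

evidence_out = agents.forward("evidence", x)
answer_out = agents.forward("answer", x)
```

After the simulated update, `evidence_out` still equals `base_out`, while `answer_out` differs in its first component. In a real setup the backbone would be a transformer and the adapters would be managed by a LoRA library rather than hand-written lists.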