Agents
State-Grounded Multi-Agent Synthetic Data Generation for Tool-Augmented LLMs
The article introduces StateGen, a synthetic data generation platform that produces multi-turn, tool-grounded conversational data for training tool-augmented LLM agents. Its architecture centers on a state manager that maintains a structured world-state object, which significantly reduces tool-call hallucinations, and it supports hierarchical multi-agent configurations. In the reported evaluation, StateGen scores 9.66/10 on tool-call hallucination metrics across 64,698 conversations, which makes it relevant to practitioners who need robust training data for complex LLM applications.
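To make the state-manager idea concrete, here is a minimal sketch of how a structured world-state object could be used to reject ungrounded tool calls. It is an illustration under stated assumptions, not StateGen's actual implementation: the names `WorldState`, `StateManager`, `check_tool_call`, and `apply`, as well as the three tools `create`, `update`, and `delete`, are all hypothetical and do not come from the article.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical tool names used only for this illustration.
KNOWN_TOOLS = {"create", "update", "delete"}


@dataclass
class WorldState:
    """Assumed shape of a structured world state: entity id -> attributes."""
    entities: dict[str, dict[str, Any]] = field(default_factory=dict)


class StateManager:
    """Illustrative manager: every tool call is checked against the state
    before it is allowed to change that state."""

    def __init__(self, state: WorldState) -> None:
        self.state = state

    def check_tool_call(self, tool: str, args: dict[str, Any]) -> list[str]:
        """Return grounding errors; an empty list means the call is grounded."""
        if tool not in KNOWN_TOOLS:
            return [f"unknown tool {tool!r}"]
        entity_id = args.get("entity_id")
        if not isinstance(entity_id, str):
            return [f"{tool}: missing entity_id"]
        exists = entity_id in self.state.entities
        if tool == "create" and exists:
            return [f"create: entity_id {entity_id!r} already exists"]
        if tool in ("update", "delete") and not exists:
            return [f"{tool}: unknown entity_id {entity_id!r}"]
        return []

    def apply(self, tool: str, args: dict[str, Any]) -> bool:
        """Apply a grounded call to the state; refuse an ungrounded one."""
        if self.check_tool_call(tool, args):
            return False
        entity_id = args["entity_id"]
        if tool == "create":
            self.state.entities[entity_id] = dict(args.get("fields", {}))
        elif tool == "update":
            self.state.entities[entity_id].update(args.get("fields", {}))
        else:  # delete
            del self.state.entities[entity_id]
        return True
```

In this sketch, a generated turn that tries to `update` an entity that was never created is rejected rather than written into the conversation, which is one plausible way a shared world state could keep tool calls consistent across turns.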
data generation, tool-augmented, llm