Agents
Adaptive Turn-Taking for Real-time Multi-Party Voice Agents
ModeratorLM is a role-playing voice agent for multi-party conversations, designed to address turn-taking challenges in dynamic settings. It is built on a speech large language model with chunk-wise streaming, and a reasoning-augmented variant applies chain-of-thought reasoning based on assigned roles. Experiments report improvements of over 40% in turn-taking precision and over 70% in recall on both real-world meeting data and RolePlayConv, a newly created dataset featuring diverse assistant roles. These results suggest that the approach makes voice agents more effective in collaborative environments.
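To make the precision and recall figures concrete, here is a minimal sketch of how the two metrics can be computed for turn-taking. It assumes the agent makes one binary decision per audio chunk (take the turn or stay silent) and that a reference label exists for each chunk. The summary does not state the exact evaluation protocol, so the function name, the per-chunk framing, and the toy data are all illustrative assumptions.

```python
def turn_taking_precision_recall(predicted, reference):
    """Precision and recall over per-chunk binary decisions.

    True means "the agent takes the turn on this chunk".
    The per-chunk framing is an assumption, not the paper's stated protocol.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    pairs = list(zip(predicted, reference))
    true_pos = sum(1 for p, r in pairs if p and r)       # spoke when it should
    false_pos = sum(1 for p, r in pairs if p and not r)  # interrupted
    false_neg = sum(1 for p, r in pairs if not p and r)  # missed its turn

    # Guard against division by zero when there are no positives.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Hypothetical toy data: six chunks of a meeting.
reference = [False, False, True, False, True, False]
predicted = [False, True, True, False, False, False]
precision, recall = turn_taking_precision_recall(predicted, reference)
```

Under this framing, low precision corresponds to an agent that interrupts when it should stay silent, and low recall to an agent that misses moments where its role calls for a response.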
voice agents, turn-taking, moderatorlm