Agents
Multi-Faceted Interactivity Alignment in Full-Duplex Speech Models
The paper presents a post-training alignment method for full-duplex spoken dialogue models. It targets interactivity issues such as pause handling, turn-taking, backchanneling, and user interruption, using reinforcement learning (RL) with axis-specific reward functions. Evaluated on the open-source models Moshi and PersonaPlex, the approach yields consistent interactivity improvements in both offline and real-time multi-turn dialogue evaluations. For practitioners, the result suggests that the conversational dynamics of existing full-duplex dialogue systems can be improved at the post-training stage, making interactions feel more natural.
dialogue speech alignment