Multimodal
Self-EmoQ: Plutchik-Guided Value-based Planning to Drive Streaming Emotional TTS
The article presents Self-EmoQ, an emotion-planning framework for streaming text-to-speech (TTS) synthesis that integrates a self-emotion determination mechanism. It uses a plug-and-play LLM module, initialized from pretrained models and trained via reinforcement learning, with Plutchik's wheel of emotions used for action selection. Experiments on datasets such as DailyDialog and MELD show significant improvements in emotion determination and response quality over traditional prompting and fine-tuning approaches, which makes the framework useful for emotional interaction in conversational AI systems.
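To make the idea of value-based action selection over Plutchik's wheel concrete, here is a minimal sketch. It is not the authors' implementation: it assumes the action set is the eight primary emotions of Plutchik's wheel, and that some model (in the described framework, presumably the LLM module) has already produced one value estimate per emotion. The function name `select_emotion`, the `epsilon` parameter, and the example numbers are hypothetical.

```python
import random

# Plutchik's eight primary emotions, treated here as the discrete action set
# (an assumption; the summary does not say how the wheel is discretized).
PLUTCHIK_EMOTIONS = [
    "joy", "trust", "fear", "surprise",
    "sadness", "disgust", "anger", "anticipation",
]


def select_emotion(q_values, epsilon=0.0, rng=None):
    """Pick an emotion by epsilon-greedy selection over per-emotion values.

    q_values: one estimated value per entry in PLUTCHIK_EMOTIONS
              (hypothetical inputs; in practice a trained model would supply them).
    epsilon:  probability of exploring a uniformly random emotion.
    """
    if len(q_values) != len(PLUTCHIK_EMOTIONS):
        raise ValueError("expected one value per Plutchik emotion")
    rng = rng or random.Random()
    # Explore: with probability epsilon, ignore the values entirely.
    if rng.random() < epsilon:
        return rng.choice(PLUTCHIK_EMOTIONS)
    # Exploit: return the emotion with the highest estimated value.
    best_index = max(range(len(q_values)), key=lambda i: q_values[i])
    return PLUTCHIK_EMOTIONS[best_index]


# Placeholder values for illustration only, not results from the article.
example_values = [0.1, 0.2, 0.0, 0.05, 0.9, 0.0, 0.3, 0.4]
chosen = select_emotion(example_values)
```

With `epsilon=0.0` the choice is purely greedy, so `chosen` is the emotion with the largest value. A streaming TTS system could then condition synthesis on that label, though how the article passes the selected emotion to the synthesizer is not stated in this summary.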
emotional interaction, tts, llm