Research
Entity Binding Failures in Speech LLM Reasoning: Diagnosis and Chain-of-Thought Intervention
The paper diagnoses entity binding failures in Speech Large Language Models (SLLMs) during complex reasoning tasks. It shows that SLLMs perform well on spatial, syntactic, and factual tasks, but that their accuracy collapses on logical tasks, and it links that collapse to failures in entity tracking. The authors introduce Entity-Aware Chain-of-Thought (EA-CoT), an intervention that explicitly binds entities to claims and improves accuracy by up to 24.4 percentage points. For AI practitioners, the work reframes how SLLM limitations are understood and offers a method for improving reasoning in speech-based models.
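The summary does not reproduce the EA-CoT prompt itself. As a rough sketch of what "explicit binding of entities to claims" could look like at the prompt level, the hypothetical helper below wraps a question with an instruction to restate every claim alongside the entity it belongs to before reasoning. The function name, the instruction wording, and the example entities are all assumptions made for illustration, not the authors' implementation.

```python
def build_entity_aware_prompt(question: str, entities: list[str]) -> str:
    """Hypothetical prompt wrapper: asks the model to tie each claim to a
    named entity before reasoning. Not the paper's actual EA-CoT prompt."""
    # One bullet per entity so the model has a fixed set of names to bind to.
    entity_lines = "\n".join(f"- {name}" for name in entities)
    return (
        "Entities mentioned in the audio:\n"
        f"{entity_lines}\n\n"
        "Before answering, list each claim as '<entity>: <claim>', "
        "using only the entity names above.\n"
        "Then reason step by step over those bound claims.\n\n"
        f"Question: {question}"
    )


# Illustrative usage with made-up entities and a made-up question.
prompt = build_entity_aware_prompt("Who is taller, Ana or Ben?", ["Ana", "Ben"])
print(prompt)
```

The resulting string would be sent as the text instruction alongside the audio input. How entities are obtained in the first place (from a transcript, from the model itself, or by hand) is left open here.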
llm, speech, reasoning