ai-digest.dev
Research · arXiv cs.CL · 2 d ago

Enhancing Multilingual LLM-based ASR with Mixture of Experts and Dynamic Downsampling

The paper presents a projector-based LLM-ASR framework that combines a Mixture of Experts (MoE) architecture with a Continuous Integrate-and-Fire (CIF) mechanism to improve multilingual generalization and speech-text modality alignment in automatic speech recognition. The authors report significant improvements over strong baseline models, suggesting the approach could yield more accurate and robust LLM-based ASR systems. The result is relevant to practitioners building multilingual ASR applications on top of LLMs.
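
For readers unfamiliar with CIF, the following is a minimal sketch of the generic integrate-and-fire idea, not the paper's implementation. It assumes per-frame weights (`alphas`) are already given, that each weight is smaller than the firing threshold, and that the threshold is 1.0; the function name `cif_downsample` and all parameters are hypothetical. Frames are accumulated until their weights sum to the threshold, at which point one integrated vector is emitted, so the output length adapts to the input instead of using a fixed stride.

```python
import numpy as np

def cif_downsample(frames, alphas, threshold=1.0):
    """Illustrative integrate-and-fire over frame features.

    frames: array of shape (T, D), one feature vector per frame.
    alphas: array of shape (T,), per-frame weights, each assumed < threshold.
    Returns an array of shape (N, D) with one row per firing.
    """
    fired = []
    acc_weight = 0.0                         # weight accumulated since last firing
    acc_state = np.zeros(frames.shape[1])    # weighted sum of frames since last firing
    for h, a in zip(frames, alphas):
        if acc_weight + a < threshold:
            # Not enough weight yet: keep integrating.
            acc_weight += a
            acc_state = acc_state + a * h
        else:
            # Use only the part of this frame's weight needed to reach the threshold.
            used = threshold - acc_weight
            fired.append(acc_state + used * h)
            # The leftover weight starts the next accumulation.
            acc_weight = a - used
            acc_state = acc_weight * h
    if not fired:
        return np.zeros((0, frames.shape[1]))
    return np.stack(fired)

# Four frames with weight 0.5 each fire twice, halving the sequence length.
frames = np.eye(4)
alphas = np.array([0.5, 0.5, 0.5, 0.5])
out = cif_downsample(frames, alphas)
```

In a trained system the weights would be predicted by the model rather than supplied by hand, and any trailing partial accumulation (dropped in this sketch) would need its own handling.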

asr · multilingual · llm