Research · arXiv cs.CL · 11 d ago

Cross-lingual Embedding Clustering for Hierarchical Softmax in Low-Resource Multilingual Speech Recognition

This paper proposes a cross-lingual embedding clustering method for building a hierarchical Softmax (H-Softmax) decoder, with the goal of improving multilingual Automatic Speech Recognition (ASR) for low-resource languages. The decoder lets similar tokens from different languages share decoder representations. This addresses a limitation of the earlier Huffman-based H-Softmax method, which relied on shallow features to assess token similarity. In experiments on a downsampled dataset covering 15 languages, the authors report a significant improvement in ASR accuracy for low-resource languages, a result relevant to practitioners building multilingual ASR systems.
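The general idea of clustering token embeddings and decoding through the resulting hierarchy can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not say which clustering algorithm or tree depth the paper uses, so the sketch assumes plain k-means and a two-level hierarchy in which P(token) = P(cluster) · P(token | cluster). The token names, the 2-D embeddings, and all function names are hypothetical toy values chosen for illustration.

```python
import math
import random


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, iters=10):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Next seed: the point farthest from every centroid chosen so far.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def group_by_cluster(assign):
    """Return one list of token indices per non-empty cluster."""
    groups = {}
    for i, c in enumerate(assign):
        groups.setdefault(c, []).append(i)
    return [groups[c] for c in sorted(groups)]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def h_softmax(hidden, cluster_weights, token_weights, groups):
    """Two-level H-Softmax: P(token) = P(cluster) * P(token | cluster)."""
    p_cluster = softmax([dot(hidden, w) for w in cluster_weights])
    probs = [0.0] * len(token_weights)
    for pc, members in zip(p_cluster, groups):
        p_in = softmax([dot(hidden, token_weights[i]) for i in members])
        for i, p in zip(members, p_in):
            probs[i] = pc * p
    return probs


# Hypothetical tokens from several languages with hand-made toy embeddings;
# similar tokens across languages are placed close together on purpose.
tokens = ["en:a", "es:a", "en:k", "de:k", "en:s", "fr:s"]
embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0), (10.1, 0.0)]

assign = kmeans(embeddings, k=3)
groups = group_by_cluster(assign)

# Random stand-ins for a decoder state and learned output weights.
rng = random.Random(0)
dim = 4
hidden = [rng.gauss(0, 1) for _ in range(dim)]
cluster_weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in groups]
token_weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in tokens]

probs = h_softmax(hidden, cluster_weights, token_weights, groups)
```

In this toy setup, tokens that land in the same cluster share the cluster-level decision, which is one simple way to read "sharing decoder representations"; the paper's actual hierarchy may be deeper or built differently.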

multilingual · asr · low_resource
