ai-digest.dev
Research · arXiv cs.CL · 7 d ago

PiDA: Phonetically-Informed Data Augmentation for Robust Vietnamese Speech Translation

The paper presents Phonetically-Informed Data Augmentation (PiDA), a method for making Vietnamese speech translation systems more robust to substitution errors from Automatic Speech Recognition (ASR). PiDA uses phonetic word embeddings to create ASR-like corruptions, which improves translation accuracy on erroneous ASR output: on the FLEURS Vietnamese-English dataset, it raises BLEU by up to 2.04 points over standard fine-tuning. For practitioners, the approach offers a systematic way to limit ASR error propagation in cascaded speech translation systems.
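To make the idea of ASR-like corruptions concrete, here is a minimal sketch of phonetic substitution, not the paper's actual procedure. It assumes a lookup table of phonetic word embeddings in which similar-sounding words have nearby vectors. The table `PHONETIC_EMB`, its values, and the helpers `nearest_phonetic_neighbor` and `corrupt` are all hypothetical names invented for illustration.

```python
import math
import random

# Hypothetical toy "phonetic embeddings": similar-sounding words get nearby
# vectors. The numbers are invented for illustration, not taken from the paper.
PHONETIC_EMB = {
    "ba": (1.0, 0.1, 0.0),
    "bà": (0.9, 0.2, 0.0),
    "ma": (0.1, 1.0, 0.0),
    "má": (0.2, 0.9, 0.1),
    "tôi": (0.0, 0.1, 1.0),
    "tối": (0.1, 0.0, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nearest_phonetic_neighbor(word):
    """Return the most phonetically similar other word, or the word itself
    if it is not in the embedding table."""
    if word not in PHONETIC_EMB:
        return word
    target = PHONETIC_EMB[word]
    candidates = [w for w in PHONETIC_EMB if w != word]
    return max(candidates, key=lambda w: cosine(target, PHONETIC_EMB[w]))


def corrupt(tokens, rate, rng):
    """Replace each in-vocabulary token with its nearest phonetic neighbor
    with probability `rate`, imitating ASR substitution errors."""
    out = []
    for tok in tokens:
        if tok in PHONETIC_EMB and rng.random() < rate:
            out.append(nearest_phonetic_neighbor(tok))
        else:
            out.append(tok)
    return out


clean = ["tôi", "ba", "xyz"]
noisy = corrupt(clean, rate=1.0, rng=random.Random(0))
print(noisy)
```

In an augmentation setup of this kind, the corrupted source sentence would be paired with the original reference translation, so the translation model sees realistic substitution noise during fine-tuning. The substitution rate and the choice of a single nearest neighbor are assumptions of this sketch.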

speech-translation · data-augmentation · ASR