Research · arXiv cs.CL · 16 d ago

ReNikud: Audio-Supervised Hebrew Grapheme-to-Phoneme Conversion

ReNikud is a grapheme-to-phoneme (G2P) conversion approach for Modern Hebrew that learns from weak audio supervision: a phoneme-based automatic speech recognition (ASR) pipeline pseudo-labels thousands of hours of unlabeled audio. Its pseudo-vocalization architecture predicts IPA phonemes at each character position, which keeps the output aligned with the text at the character level. The authors report superior performance on existing Hebrew G2P benchmarks and on MILIM, a new benchmark for spoken Hebrew. Code and trained models are released to support further work on text-to-speech and other speech technologies.
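
To make "IPA phonemes at each character position" concrete, here is a minimal sketch of the output format only, not of the model. It assumes each Hebrew character carries its own (possibly empty) IPA chunk, so a word's transcription is the concatenation of those chunks. The function name `join_per_char_phonemes` and the per-character labels are hypothetical and hand-written for illustration; they are not taken from the ReNikud code or its predictions.

```python
# Hypothetical illustration of character-aligned G2P output.
# Assumption: one IPA chunk (possibly empty) per input character.

def join_per_char_phonemes(chars, labels):
    """Concatenate per-character IPA chunks into one transcription."""
    if len(chars) != len(labels):
        raise ValueError("need exactly one label per character")
    return "".join(labels)

word = "שלום"  # "shalom", four characters
# Hand-written labels, not model output: the vav is assigned an empty
# chunk here because its vowel is attached to the preceding letter.
labels = ["ʃa", "lo", "", "m"]

ipa = join_per_char_phonemes(list(word), labels)
print(ipa)
```

Because every chunk is tied to a character index, a mismatch between text length and label count is detectable immediately, which is one practical benefit of character-level alignment.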

g2p · hebrew · audio
