Research
ReNikud: Audio-Supervised Hebrew Grapheme-to-Phoneme Conversion
ReNikud is a grapheme-to-phoneme (G2P) conversion approach for Modern Hebrew that relies on weak audio supervision: a phoneme-based automatic speech recognition (ASR) pipeline pseudo-labels thousands of hours of unlabeled audio. Its pseudo-vocalization architecture predicts IPA phonemes at each character position, which improves character-level alignment between the input text and the phoneme output. The method achieves superior performance on existing Hebrew G2P benchmarks and on MILIM, a new benchmark for spoken Hebrew. Code and trained models are released to support further research in text-to-speech and other speech technologies.
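To make the idea of per-character phoneme prediction concrete, the following is a minimal sketch, not the ReNikud implementation. It assumes that each input character receives one IPA string (possibly empty) and that the word's pronunciation is the concatenation of those strings. The function names and the hand-written labels for the word שלום are hypothetical illustrations, not model output.

```python
from typing import List, Tuple


def aligned_pairs(chars: str, per_char_ipa: List[str]) -> List[Tuple[str, str]]:
    """Pair each input character with the IPA string assigned to it."""
    if len(chars) != len(per_char_ipa):
        raise ValueError("expected exactly one IPA label per input character")
    return list(zip(chars, per_char_ipa))


def assemble_phonemes(chars: str, per_char_ipa: List[str]) -> str:
    """Concatenate the per-character IPA labels into one phoneme string."""
    return "".join(ipa for _, ipa in aligned_pairs(chars, per_char_ipa))


# Hand-written labels for illustration only (assumption, not model output):
# each consonant letter carries the vowel that follows it, if any.
word = "שלום"
labels = ["ʃa", "l", "o", "m"]

print(aligned_pairs(word, labels))
print(assemble_phonemes(word, labels))
```

Because every label is tied to one character position, the alignment between spelling and pronunciation can be read directly from the pairs, with no separate alignment step.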
g2p, hebrew, audio