Research · arXiv cs.AI · 12 d ago

Perceptual compensation for tonal context in self-supervised speech models

This study investigates whether the wav2vec 2.0 architecture compensates for phonological context in Mandarin Chinese tones, using a pseudo-replication of a perceptual compensation experiment. It found no evidence of compensation in the embedding similarities of the purely self-supervised model; the fine-tuned model showed some improvement in its classifier outputs but did not reach human-level performance. These findings suggest that supervised training may be needed for models to abstract certain phonological regularities, a relevant consideration for practitioners building robust ASR systems.
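To make "compensation in the embedding similarities" concrete, here is a minimal sketch of one way such a comparison could be set up. It is not the paper's procedure: the function names, the mean pooling step, and the `context_shift` measure are all illustrative assumptions, and the vectors stand in for frame-level embeddings that would come from a speech model.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_pool(frames: Sequence[Vector]) -> List[float]:
    """Average frame-level embeddings into a single vector (assumed pooling)."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def preference(target: Vector, proto_x: Vector, proto_y: Vector) -> float:
    """Positive when the target lies closer to prototype X than to prototype Y."""
    return cosine_similarity(target, proto_x) - cosine_similarity(target, proto_y)


def context_shift(target_ctx_a: Vector, target_ctx_b: Vector,
                  proto_x: Vector, proto_y: Vector) -> float:
    """Hypothetical measure: how much the X-vs-Y preference for the same
    target changes between two preceding contexts. A value near zero means
    the context made no difference to the embedding's similarity profile."""
    return (preference(target_ctx_a, proto_x, proto_y)
            - preference(target_ctx_b, proto_x, proto_y))
```

Under these assumptions, a model that compensates for tonal context would yield a `context_shift` consistently different from zero across items, whereas a value near zero would correspond to the null result reported for the purely self-supervised model.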

self-supervised models · speech recognition · phonological context
