Research · arXiv cs.CL · 16 d ago

PASQA: Pitch-Accent-Focused Speech Quality Assessment Model Trained on Synthetic Speech with Accent Errors

The paper presents the Pitch-Accent-focused Speech Quality Assessment (PASQA) model, which is designed to improve the prediction of pitch-accent correctness in speech quality assessment. PASQA is trained on a controlled dataset of synthetic Japanese speech with accent errors, and it combines self-supervised representations, mora-conditioned fusion, and an auxiliary accent-error localization task. The model is reported to order accent-error severity better and to align more closely with human judgments than conventional mean opinion score (MOS) models, which makes it relevant to practitioners who need fine-grained speech evaluation in AI applications.
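
One way to make "ordering accent-error severity" concrete is a pairwise check: for every pair of utterances with different numbers of accent errors, does the model give the less-corrupted one the higher score? The sketch below is illustrative only and is not taken from the paper. The function name, the use of an error count as the severity measure, and the sample values are all assumptions made for this example.

```python
from itertools import combinations


def pairwise_ordering_accuracy(scores, error_counts):
    """Fraction of utterance pairs ordered correctly by predicted score.

    scores: predicted quality scores, higher means better (hypothetical).
    error_counts: number of injected accent errors per utterance (hypothetical).
    Pairs with equal error counts are skipped; tied scores count as wrong.
    """
    correct = 0
    total = 0
    for (s_a, e_a), (s_b, e_b) in combinations(zip(scores, error_counts), 2):
        if e_a == e_b:
            continue  # equal severity: there is no ordering to check
        total += 1
        # The utterance with fewer errors should score strictly higher.
        if s_a != s_b and (s_a > s_b) == (e_a < e_b):
            correct += 1
    return correct / total if total else float("nan")


# Made-up scores for four versions of one sentence with 0 to 3 accent errors.
predicted = [4.5, 3.8, 3.9, 2.1]
errors = [0, 1, 2, 3]
accuracy = pairwise_ordering_accuracy(predicted, errors)
print(f"pairwise ordering accuracy: {accuracy:.3f}")
```

In this made-up example only the pair with one and two errors is ordered incorrectly, so five of the six pairs are counted as correct. A rank correlation such as Spearman's or Kendall's coefficient would be a common alternative for the same question.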

speech-quality · accent-errors · mos · llm
