Safety · arXiv cs.AI · 10 d ago

Robust Spoofed Speech Detection via Temporal Pyramid Modeling

The paper introduces a Temporal Pyramid Adapter for spoofed speech detection. The adapter runs temporal convolutions with different receptive fields in parallel, so the model can pick up spoofing cues at multiple time scales. Built on self-supervised XLS-R representations, the model reports a state-of-the-art AUC of 99.24% and an EER of 3.87% on the PartialSpoof dataset, outperforming existing models such as LCNN-BLSTM and TRACE. For practitioners, the work also addresses cross-dataset generalization and highlights the role of adaptation strategies in maintaining performance across domains and languages.
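To make the multi-scale idea concrete, here is a minimal pure-Python sketch of parallel temporal convolutions with different receptive fields. It is not the paper's implementation: the function names, the kernel sizes (3, 7, 15), and the fixed averaging weights are all assumptions chosen for illustration. A real adapter would presumably learn its kernels and operate on XLS-R feature sequences rather than a single scalar track.

```python
from typing import List, Sequence


def conv1d_same(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """1-D correlation with zero padding; output length equals input length.

    Assumes an odd kernel length so the window is centred on each frame.
    """
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(padded[t + j] * kernel[j] for j in range(k))
        for t in range(len(signal))
    ]


def temporal_pyramid(
    signal: Sequence[float],
    kernel_sizes: Sequence[int] = (3, 7, 15),  # hypothetical receptive fields
) -> List[List[float]]:
    """Run one branch per kernel size and stack the outputs per frame."""
    branches = []
    for k in kernel_sizes:
        # Placeholder weights (moving average); a trained adapter would learn these.
        kernel = [1.0 / k] * k
        branches.append(conv1d_same(signal, kernel))
    # One feature vector per frame, one entry per temporal scale.
    return [list(frame) for frame in zip(*branches)]


# Usage: a single-frame impulse, e.g. a very short spoofed segment.
impulse = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
features = temporal_pyramid(impulse)
```

In this toy setting the short-kernel branch responds only in the frames next to the impulse, while the long-kernel branch spreads a weaker response across the whole sequence. That is the intuition behind combining several receptive fields to catch both brief and extended spoofed regions.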

spoofed speech detection · voice conversion · security
