Research · arXiv cs.AI · 21 h ago

What Do Deepfake Speech Detectors Actually Hear?

The paper presents an audio-native explainability pipeline for deepfake speech detectors. It applies Integrated Gradients to time-aligned self-supervised representations to localize the evidence behind a decision in the audio signal. The authors test the approach on three WavLM-based detectors (AASIST, CA-MHFA, SLS) using the ASVspoof 5 dataset and find that, although the models perform similarly, they rely on distinct cues. For practitioners, this shows which features drive a detection outcome and makes the detectors easier to interpret in real-world applications.
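To make the attribution step concrete, here is a minimal sketch of Integrated Gradients over frame-level features, with per-frame relevance obtained by summing attributions across feature dimensions. It is not the paper's pipeline. The toy scorer stands in for a WavLM-based detector, and every name in it (`score`, `score_grad`, `integrated_gradients`, `frame_relevance`), the all-zero baseline, and the example values are assumptions made for illustration.

```python
import math

def score(frames, weights):
    """Toy detector (assumption): sigmoid of the frame-averaged weighted feature sum."""
    z = sum(sum(w * v for w, v in zip(weights, f)) for f in frames) / len(frames)
    return 1.0 / (1.0 + math.exp(-z))

def score_grad(frames, weights):
    """Analytic gradient of score() with respect to every frame feature."""
    s = score(frames, weights)
    scale = s * (1.0 - s) / len(frames)
    return [[scale * w for w in weights] for _ in frames]

def integrated_gradients(frames, baseline, weights, steps=64):
    """Midpoint Riemann approximation of Integrated Gradients along a straight path."""
    total = [[0.0] * len(weights) for _ in frames]
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Interpolate between the baseline and the actual input.
        point = [[b + alpha * (x - b) for x, b in zip(fx, fb)]
                 for fx, fb in zip(frames, baseline)]
        grad = score_grad(point, weights)
        for t, row in enumerate(grad):
            for d, g in enumerate(row):
                total[t][d] += g
    # Scale the path-averaged gradient by the input-minus-baseline difference.
    return [[(x - b) * g / steps for x, b, g in zip(fx, fb, gt)]
            for fx, fb, gt in zip(frames, baseline, total)]

def frame_relevance(attributions):
    """Collapse feature-level attributions to one relevance value per time frame."""
    return [sum(row) for row in attributions]

# Hypothetical input: three frames of two-dimensional features.
frames = [[0.2, -0.1], [1.5, 0.8], [0.1, 0.0]]
baseline = [[0.0, 0.0] for _ in frames]
weights = [1.0, -0.5]

attributions = integrated_gradients(frames, baseline, weights)
relevance = frame_relevance(attributions)
print(relevance)
```

Because the features are time-aligned, each relevance value maps back to a fixed span of audio. The attributions should also sum approximately to the difference between the score on the input and the score on the baseline, which is a useful sanity check on the number of integration steps.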

deepfake · explainability · audio
