Safety · arXiv cs.AI · 7 d ago

The Perceived Fragility of Explanations in Audio Models: Manipulation of Attribution with Unchanged Predictions

This paper presents a psychoacoustic framework that exposes the fragility of post-hoc explanation methods in audio deepfake detection: an adversary can manipulate explanation heatmaps without changing the model's predictions. The study evaluates several state-of-the-art architectures under strict constraints and uses domain-specific perceptual audio quality metrics to measure the cost of manipulation. For practitioners, the result highlights a vulnerability in audio model interpretability and the need for more robust explanation techniques.

explanation · audio-models · deepfake-detection · relevance 0.00 · engagement 0.00
