Discrete optimal transport is a strong audio adversarial attack
The paper introduces a novel black-box adversarial attack based on discrete optimal transport (DOT) against automatic speaker verification (ASV) and anti-spoofing systems. The method uses entropic optimal transport to align frame-level WavLM embeddings of generated or spoofed speech to a pool of bona fide speech. This significantly increases the equal error rate (EER) of countermeasures and degrades ASV performance across multiple datasets, including ASVspoof2019 and ASVspoof5. Because the attack requires no access to model parameters or training data, it represents a new attack vector and underscores the need for robust defenses in contemporary ASV systems.
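The alignment step described above can be illustrated with a minimal sketch of entropic optimal transport between two sets of frame embeddings. This is not the authors' implementation: the cosine cost, uniform frame weights, regularization strength `epsilon`, iteration count, barycentric projection, and the random arrays standing in for WavLM features are all assumptions made for illustration, and the helper names `sinkhorn_plan` and `align_to_pool` are hypothetical.

```python
import numpy as np


def sinkhorn_plan(cost, epsilon=0.1, n_iters=200):
    """Entropic OT plan between uniform marginals via Sinkhorn iterations."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)          # uniform mass on source frames (assumption)
    b = np.full(m, 1.0 / m)          # uniform mass on pool frames (assumption)
    K = np.exp(-cost / epsilon)      # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)              # rescale rows toward marginal a
        v = b / (K.T @ u)            # rescale columns toward marginal b
    return u[:, None] * K * v[None, :]


def align_to_pool(source, pool, epsilon=0.1):
    """Map each source frame to a transport-weighted average of pool frames."""
    s = source / np.linalg.norm(source, axis=1, keepdims=True)
    p = pool / np.linalg.norm(pool, axis=1, keepdims=True)
    cost = 1.0 - s @ p.T             # cosine distance (assumed cost function)
    plan = sinkhorn_plan(cost, epsilon)
    weights = plan / plan.sum(axis=1, keepdims=True)
    return weights @ pool, plan      # barycentric projection onto the pool


# Random stand-ins for frame-level embeddings; real WavLM features are not used here.
rng = np.random.default_rng(0)
source = rng.normal(size=(50, 16))   # frames of generated or spoofed speech
pool = rng.normal(size=(200, 16))    # frames of bona fide speech
aligned, plan = align_to_pool(source, pool)
```

In this sketch, a smaller `epsilon` makes the plan sparser, so each source frame is mapped closer to a single bona fide frame, while a larger value averages over more of the pool. Turning the aligned embeddings back into a waveform would require a separate synthesis step that is not shown.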
adversarial-attack, audio, speaker-verification