Dual-Branch Gated Fusion for Open-Set Audio Deepfake Source Tracing
The paper presents a dual-branch gated fusion framework for audio deepfake source tracing that combines XLSR-53 with a 66-dimensional CORES descriptor capturing diverse synthesis artifacts. The model reaches 97.6% in-domain accuracy and a 4.9% equal error rate on the MLAAD benchmark, and it outperforms the Interspeech 2025 baseline with an 83.5% relative reduction in false positive rate at 95% recall. The work matters to practitioners because closed-set models cannot handle unseen synthesizers, and the proposed approach improves generalization under distribution shift.
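To make the idea of gated fusion concrete, the following is a minimal sketch of one common formulation, not the paper's confirmed architecture. It assumes each branch is projected to a shared width, and that a sigmoid gate computed from both projections mixes them element-wise. The names `gated_fusion`, `init_layer`, `SSL_DIM`, and `HIDDEN` are hypothetical, as are the 1024-dimensional pooled embedding and the hidden width of 16. Only the 66-dimensional descriptor size comes from the text. The weights are random and untrained, and random vectors stand in for real features.

```python
import math
import random

def linear(x, weights, bias):
    """Dense layer: `weights` holds one row of input weights per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained, for illustration only)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def gated_fusion(ssl_emb, cores_desc, params):
    """Assumed formulation: fused = g * h_ssl + (1 - g) * h_cores."""
    # Project both branches to a shared width so they can be mixed element-wise.
    h_ssl = [math.tanh(v) for v in linear(ssl_emb, *params["ssl_proj"])]
    h_cores = [math.tanh(v) for v in linear(cores_desc, *params["cores_proj"])]
    # The gate sees both projections and emits one mixing weight per dimension.
    gate = [sigmoid(v) for v in linear(h_ssl + h_cores, *params["gate"])]
    fused = [g * a + (1.0 - g) * b for g, a, b in zip(gate, h_ssl, h_cores)]
    return fused, gate

rng = random.Random(0)
SSL_DIM = 1024   # assumed width of a pooled XLSR-53 embedding
CORES_DIM = 66   # descriptor size stated in the text
HIDDEN = 16      # assumed shared width, kept small for the demo

params = {
    "ssl_proj": init_layer(SSL_DIM, HIDDEN, rng),
    "cores_proj": init_layer(CORES_DIM, HIDDEN, rng),
    "gate": init_layer(2 * HIDDEN, HIDDEN, rng),
}

# Random stand-ins for one utterance's features from each branch.
ssl_emb = [rng.gauss(0.0, 1.0) for _ in range(SSL_DIM)]
cores_desc = [rng.gauss(0.0, 1.0) for _ in range(CORES_DIM)]

fused, gate = gated_fusion(ssl_emb, cores_desc, params)
print(len(fused), min(gate), max(gate))
```

In this sketch a gate value near 1 means the fused feature leans on the self-supervised branch for that dimension, and a value near 0 means it leans on the handcrafted descriptor. A learned gate of this kind could, in principle, shift weight between branches per input, which is one plausible reading of how such a fusion helps under distribution shift.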