Inference
Towards Data-free and Training-free Compression for Speech Foundation Models Using Parameter Clustering
This paper introduces a data-free, training-free compression technique for speech foundation models based on channelwise k-means clustering of parameters. It reports absolute word error rate (WER) reductions of up to 27.73% on HuBERT-large and 5.02% on Whisper-large-v3, at sparsity levels of 50% and 10%, respectively, without fine-tuning. The approach is relevant to practitioners who want to shrink speech models without collecting calibration data or retraining.
compression, speech models, parameter clustering
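As a rough illustration of what channelwise k-means clustering of parameters can look like, here is a minimal sketch. It is not the paper's implementation. It assumes a "channel" is one row of a 2-D weight matrix and that compression means replacing every row with the centroid of its cluster. How the paper defines channels, picks the number of clusters, or relates clustering to the quoted sparsity levels is not stated in this summary. The function names `kmeans` and `cluster_channels`, the cluster count, and the random weights are all hypothetical.

```python
import numpy as np


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on the rows of `points`; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct rows of the input.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every row to every centroid.
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    # Final assignment so the labels match the returned centroids.
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dists.argmin(axis=1)


def cluster_channels(weight, k):
    """Assumption: each row of `weight` is one channel; rows in a cluster share a centroid."""
    centroids, labels = kmeans(weight, k)
    compressed = centroids[labels]  # same shape as `weight`, at most k distinct rows
    return compressed, centroids, labels


# Hypothetical usage on random weights (no data or training involved).
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16)).astype(np.float32)
W_hat, centroids, labels = cluster_channels(W, k=4)
print(W_hat.shape, len(np.unique(W_hat, axis=0)))
```

Under these assumptions, only the `k` centroids and one cluster index per channel need to be stored, and the procedure uses nothing but the weights themselves, which is what makes it data-free and training-free in this sketch.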