Training · arXiv cs.CL · 11 d ago

Beyond Layer Importance in Layer-wise Sparsity: An Inter-Layer Perturbation-Absorption Perspective

This paper approaches layer-wise sparsity in large language models (LLMs) through inter-layer perturbation absorption rather than layer importance alone. It shows empirically that early layers amplify perturbations while middle and late layers absorb them, and it defines a per-layer absorption coefficient to capture this behavior. The coefficient drives an absorption-aware correction that improves existing pruning methods such as OWL and AlphaPruning, yielding a 7.13% reduction in perplexity and a 1.02% improvement in zero-shot accuracy at 70% sparsity.
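The idea of a per-layer absorption coefficient feeding a sparsity correction can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the paper's formulation: the coefficient is taken to be the ratio of output to input perturbation norm for each layer, and the hypothetical `corrected_sparsity` helper shifts sparsity away from amplifying layers while keeping the mean sparsity fixed. The layers are plain scalar gains, not transformer blocks.

```python
import numpy as np

def absorption_coefficients(layers, x, eps=1e-3, seed=0):
    """Per-layer ratio of output to input perturbation norm.

    Assumed definition for illustration: > 1 means the layer amplifies
    a perturbation, < 1 means it absorbs it.
    """
    rng = np.random.default_rng(seed)
    coeffs = []
    h = x
    for f in layers:
        # Random perturbation scaled to eps times the activation norm.
        delta = rng.standard_normal(h.shape)
        delta *= eps * np.linalg.norm(h) / np.linalg.norm(delta)
        out_clean = f(h)
        out_pert = f(h + delta)
        coeffs.append(np.linalg.norm(out_pert - out_clean) / np.linalg.norm(delta))
        h = out_clean  # propagate the clean activation to the next layer
    return np.array(coeffs)

def corrected_sparsity(base, coeffs, strength=0.05):
    """Hypothetical correction: prune amplifying layers less, absorbing layers more.

    The shift is zero-mean, so the average sparsity is unchanged
    as long as no value is clipped.
    """
    score = np.log(coeffs)
    shift = -strength * (score - score.mean())
    return np.clip(base + shift, 0.0, 1.0)

# Toy stack: early layers amplify (gain > 1), later layers absorb (gain < 1).
gains = [1.5, 1.2, 0.9, 0.7]
layers = [lambda h, g=g: g * h for g in gains]
x = np.ones(8)

coeffs = absorption_coefficients(layers, x)
base = np.full(len(layers), 0.7)  # uniform 70% sparsity as a starting point
sparsity = corrected_sparsity(base, coeffs)
print(coeffs)
print(sparsity)
```

In a real setting the base allocation would come from a method such as OWL or AlphaPruning instead of a uniform vector, and the perturbation would be the error introduced by pruning rather than random noise.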

llm · sparsity · pruning · relevance 0.00 · engagement 0.00
