Training
Squeeze-Release: Iterative Pruning with Exact Structural Minimization
The paper introduces Squeeze-Release, a method that combines iterative pruning with exact structural minimization to convert masked networks into smaller dense networks while preserving the forward function. The authors report a 39x size reduction for fully-connected networks and 14.8x for ConvNeXt-Tiny without compromising accuracy. They also present CompensatedLayerNorm, which extends the method to architectures that use LayerNorm, including transformers. The result is aimed at practitioners who need smaller models for deployment.
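To make "converting a masked network into a smaller dense network while preserving the forward function" concrete, here is a minimal sketch for the simplest case: a two-layer ReLU MLP whose pruned weights are exact zeros. It is not the paper's algorithm, and the names `shrink_hidden` and `forward` are hypothetical. It only illustrates two safe rewrites: a hidden unit with no outgoing weights can be deleted, and a hidden unit with no incoming weights emits a constant that can be folded into the next layer's bias.

```python
def relu(v):
    return [max(0.0, a) for a in v]

def matvec(W, x):
    # W is a list of rows; an empty W yields an empty result.
    return [sum(w * a for w, a in zip(row, x)) for row in W]

def forward(W1, b1, W2, b2, x):
    h = relu([a + b for a, b in zip(matvec(W1, x), b1)])
    return [a + b for a, b in zip(matvec(W2, h), b2)]

def shrink_hidden(W1, b1, W2, b2):
    """Remove removable hidden units; returns smaller dense weights."""
    keep = []
    b2_new = list(b2)
    for j in range(len(W1)):
        outgoing = [row[j] for row in W2]
        if all(w == 0.0 for w in outgoing):
            continue  # unit never reaches the output: delete it
        if all(w == 0.0 for w in W1[j]):
            # unit ignores the input and always emits relu(b1[j]);
            # fold that constant into the next layer's bias
            c = max(0.0, b1[j])
            b2_new = [b + w * c for b, w in zip(b2_new, outgoing)]
            continue
        keep.append(j)
    W1_s = [W1[j] for j in keep]
    b1_s = [b1[j] for j in keep]
    W2_s = [[row[j] for j in keep] for row in W2]
    return W1_s, b1_s, W2_s, b2_new

# Toy masked network (illustrative values): 3 inputs, 4 hidden units, 2 outputs.
W1 = [[1.0, 0.0, -2.0], [0.0, 0.0, 0.0], [0.5, 1.5, 0.0], [0.0, 3.0, 0.0]]
b1 = [0.1, 0.7, -0.2, 0.0]
W2 = [[2.0, 1.0, 0.0, 0.0], [0.0, -1.0, 0.0, 0.0]]
b2 = [0.0, 1.0]

W1_s, b1_s, W2_s, b2_s = shrink_hidden(W1, b1, W2, b2)
x = [0.3, -1.2, 0.4]
y_full = forward(W1, b1, W2, b2, x)
y_small = forward(W1_s, b1_s, W2_s, b2_s, x)
```

In this toy case only one of the four hidden units survives, and `y_full` and `y_small` agree up to floating-point rounding. Handling convolutions, residual connections, or normalization layers needs more care than this sketch shows.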
pruning, model-compression, training-techniques