Sigma-Branch: Hierarchical Single-Path Network Reconstruction for Dynamic Inference with Reduced Active Parameters
Sigma-Branch (SigmaB) is a framework that restructures a pretrained dense network into a hierarchical binary tree, enabling dynamic single-path inference with fewer active parameters while the complete dense parameter set stays in memory. It distributes weights across branches using activation-based spherical k-means clustering, then fine-tunes the tree with soft routing. The resulting SigmaB-Net reduces per-inference active parameters by 58-60% relative to dense baselines, with only a 1.72 percentage point drop in Top-1 accuracy across multiple datasets, including CIFAR-100 and ImageNet-1K. The approach significantly outperforms static structured pruning techniques, making it a valuable option for deploying neural networks on memory-constrained edge devices.
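To make the clustering step concrete, here is a minimal sketch of spherical k-means with two clusters, which is how a single binary split over activation vectors could be computed. It is an illustration under stated assumptions, not the authors' implementation: the function names (`spherical_two_means`, `route`), the deterministic initialization, and the toy activations are all hypothetical, and the soft-routing fine-tuning stage is not shown.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(a, b):
    """Inner product; equals cosine similarity for unit vectors."""
    return sum(x * y for x, y in zip(a, b))

def spherical_two_means(vectors, iters=20):
    """Split vectors into two groups by cosine similarity to unit centroids."""
    points = [normalize(v) for v in vectors]
    # Assumed deterministic init: the first point and the point least similar to it.
    first = points[0]
    centroids = [first, min(points, key=lambda p: dot(p, first))]
    labels = None
    for _ in range(iters):
        new_labels = [
            0 if dot(p, centroids[0]) >= dot(p, centroids[1]) else 1
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[k] = normalize(mean)  # project the mean back onto the sphere
    return labels, centroids

def route(x, centroids):
    """Hard single-path routing: choose the child with the most similar centroid."""
    x = normalize(x)
    return 0 if dot(x, centroids[0]) >= dot(x, centroids[1]) else 1

# Toy activation vectors (hypothetical data): three near one axis, three near the other.
activations = [[1.0, 0.1], [2.0, 0.0], [0.9, 0.2], [0.1, 1.0], [0.0, 3.0], [0.2, 0.8]]
labels, centroids = spherical_two_means(activations)
left_child = route([5.0, 0.5], centroids)
right_child = route([0.3, 4.0], centroids)
```

Because only the direction of each activation vector matters, inputs with very different magnitudes but similar orientation land in the same branch. Applying the split recursively to each group would yield a binary tree in which a given input follows one root-to-leaf path.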
model compression, dynamic inference, active parameters