Inference · arXiv cs.AI · 12 d ago

S4oP: Operator-level Pruning of Structured State Space Models for Resource-Constrained Devices

The paper presents an operator-level pruning method for Structured State Space Models (SSMs), targeting the S4 and S4D architectures, to make them easier to deploy in resource-constrained environments. The method prunes up to 70% of model operators while maintaining predictive performance, combining structured masking and fine-tuning within a unified training framework. The authors report that pruning reduces inference latency, making SSMs more viable for practical applications where computational resources are limited.
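The summary does not say how S4oP defines an operator or how it scores one for removal, so the following is only a minimal sketch of structured masking in general, not the paper's method. It assumes a real-valued diagonal SSM in the spirit of S4D, treats each state channel as the prunable unit, and ranks channels with a made-up importance score. The names `ssm_kernel` and `prune_mask` and the scoring rule are hypothetical. It shows that a model with 70% of its channels masked produces the same convolution kernel as a compact model that stores only the kept channels, which is where a latency saving could come from.

```python
import math
import random


def ssm_kernel(a, b, c, length):
    """Kernel K[l] = sum_n c[n] * a[n]**l * b[n] of a diagonal SSM."""
    return [
        sum(cn * an**l * bn for an, bn, cn in zip(a, b, c))
        for l in range(length)
    ]


def prune_mask(a, b, c, sparsity):
    """0/1 mask that keeps the highest-scoring state channels."""
    # Hypothetical importance score (an assumption, not from the paper):
    # input/output gain divided by how quickly the channel decays.
    score = [abs(bn * cn) / (1.0 - abs(an)) for an, bn, cn in zip(a, b, c)]
    n_keep = max(1, round(len(a) * (1.0 - sparsity)))
    ranked = sorted(range(len(a)), key=score.__getitem__)
    keep = set(ranked[-n_keep:])
    return [1 if n in keep else 0 for n in range(len(a))]


random.seed(0)
N, L = 64, 128  # state size and kernel length, chosen for illustration
a = [random.uniform(0.1, 0.95) for _ in range(N)]  # stable decay rates
b = [random.gauss(0.0, 1.0) for _ in range(N)]
c = [random.gauss(0.0, 1.0) for _ in range(N)]

mask = prune_mask(a, b, c, sparsity=0.7)

# Masked model: full-size parameters with pruned channels zeroed out.
k_masked = ssm_kernel(a, b, [cn * m for cn, m in zip(c, mask)], L)

# Compact model: only the kept channels are stored and evaluated.
kept = [n for n in range(N) if mask[n]]
k_compact = ssm_kernel(
    [a[n] for n in kept], [b[n] for n in kept], [c[n] for n in kept], L
)
```

In a training loop the mask would typically be applied during fine-tuning so the remaining channels can compensate, and the compact form would be exported only for inference. That workflow is likewise an assumption here, not a detail taken from the paper.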

state-space-models · pruning · resource-constraints
