Mitigating scalability challenges in LUT-based neural networks via pruning optimisations
The paper presents a scalable and energy-efficient Look-Up Table (LUT)-based approximate matrix multiplication unit (LUT-MU) that combines a pruning strategy with the MADDNESS algorithm to address the scalability limits of LUT-based neural networks. The proposed architecture achieves up to 1.6× higher throughput and 4.2× better energy efficiency than CUDA-based implementations, with resource savings of 1.3× to 2.6× over the original MADDNESS networks across various configurations. For practitioners, this makes neural networks more efficient to deploy on resource-constrained hardware such as FPGAs, particularly in applications that require high precision and scalability.
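To make the idea of LUT-based approximate matrix multiplication with pruning concrete, the following is a minimal Python sketch, not the paper's implementation. It rests on several assumptions: nearest-prototype search stands in for the learned hash-based encoder that MADDNESS uses, the pruning criterion (drop the codebooks whose table entries have the smallest total magnitude) is hypothetical and chosen only for illustration, and all names (`nearest`, `build_luts`, `lut_matvec`, `prune_codebooks`) are invented here.

```python
def nearest(sub, protos):
    """Index of the prototype closest to a subvector (squared Euclidean).
    Assumption: a stand-in for MADDNESS's learned hash-based encoder."""
    dists = [sum((s - p) ** 2 for s, p in zip(sub, proto)) for proto in protos]
    return dists.index(min(dists))


def build_luts(prototypes, weights):
    """Precompute luts[c][k][j]: dot product of prototype k of codebook c
    with the slice of weight column j that covers codebook c's input rows."""
    d = len(prototypes[0][0])      # subvector length per codebook
    n_out = len(weights[0])        # number of output columns
    luts = []
    for c, protos in enumerate(prototypes):
        rows = weights[c * d:(c + 1) * d]
        luts.append([[sum(p[i] * rows[i][j] for i in range(d))
                      for j in range(n_out)] for p in protos])
    return luts


def lut_matvec(vec, prototypes, luts, keep=None):
    """Approximate vec @ weights using only encoding, lookups and additions.
    Codebooks not listed in `keep` are skipped entirely (pruned)."""
    d = len(prototypes[0][0])
    out = [0.0] * len(luts[0][0])
    for c, protos in enumerate(prototypes):
        if keep is not None and c not in keep:
            continue
        k = nearest(vec[c * d:(c + 1) * d], protos)
        for j, v in enumerate(luts[c][k]):
            out[j] += v
    return out


def prune_codebooks(luts, num_keep):
    """Hypothetical criterion (not the paper's): keep the codebooks whose
    table entries have the largest total absolute value."""
    def score(c):
        return sum(abs(v) for row in luts[c] for v in row)
    ranked = sorted(range(len(luts)), key=score, reverse=True)
    return set(ranked[:num_keep])


# Toy example: 2 codebooks, 2 prototypes each, 4 inputs, 2 outputs.
prototypes = [[[0, 0], [1, 2]], [[3, 1], [0, 5]]]
weights = [[1, 0], [0, 1], [2, 1], [1, 3]]
luts = build_luts(prototypes, weights)

x = [1, 2, 3, 1]  # built from prototypes, so the lookup result is exact here
print(lut_matvec(x, prototypes, luts))                 # [8.0, 8.0]
keep = prune_codebooks(luts, num_keep=1)
print(lut_matvec(x, prototypes, luts, keep=keep))      # [7.0, 6.0]
```

In this sketch, pruning a codebook removes both its table storage and its encoding step, which is the kind of resource saving the summary refers to; the cost is the dropped contribution to the output, visible in the second printed result.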