NeuronFabric: A Software Reference Architecture for On-Chip Transformer Training with Local Adam
NeuronFabric is a software reference architecture for on-chip transformer training with local Adam updates, intended as a basis for future FPGA and ASIC implementations. It includes a complete C# prototype covering the forward pass, backpropagation, and optimization, validated on a 334K-parameter autoregressive transformer trained on the Shakespeare corpus. The architecture introduces the BF16W configuration, which stores weights in BF16 and keeps the Adam optimizer moments in FP32. This reduces the memory footprint from 4.0 MB for a conventional all-FP32 configuration to 3.34 MB, which suits on-chip training and leaves room for activation storage.
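The quoted footprints can be reproduced with simple per-parameter arithmetic. The sketch below is an illustration, not code from the NeuronFabric prototype. It assumes the footprint counts only the weights plus the two Adam moment buffers, that 1 MB means 10^6 bytes, and that the parameter count is the rounded 334K figure. The function name `training_state_bytes` is hypothetical.

```python
# Assumption: training state = weights + two Adam moments (m and v) per parameter.
# Assumption: 334K is a rounded parameter count, so results are approximate.
NUM_PARAMS = 334_000
BYTES_FP32 = 4
BYTES_BF16 = 2


def training_state_bytes(num_params, weight_bytes, moment_bytes=BYTES_FP32, num_moments=2):
    """Bytes needed for the weights plus the Adam moment buffers."""
    return num_params * (weight_bytes + num_moments * moment_bytes)


# All-FP32 baseline: 4 + 2 * 4 = 12 bytes per parameter.
fp32_mb = training_state_bytes(NUM_PARAMS, BYTES_FP32) / 1e6

# BF16W: BF16 weights with FP32 moments, 2 + 2 * 4 = 10 bytes per parameter.
bf16w_mb = training_state_bytes(NUM_PARAMS, BYTES_BF16) / 1e6

print(f"FP32:  {fp32_mb:.2f} MB")   # FP32:  4.01 MB
print(f"BF16W: {bf16w_mb:.2f} MB")  # BF16W: 3.34 MB
```

Under these assumptions the saving comes entirely from halving the weight storage. The moments stay at 8 bytes per parameter, so the reduction is about one sixth of the all-FP32 total.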