Training · arXiv cs.AI · 8 d ago

The Curse and Blessing of Mean Bias in FP4-Quantized LLM Training

The article introduces Averis, a mean-residual splitting quantization method for FP4 training of large language models. It targets activation outliers that the authors attribute to a coherent rank-one mean bias, and it separates the mean component prior to quantization. In training Qwen3 models this improves robustness, with loss gaps reduced to 1.19% and 0.81% compared to NVIDIA's Hadamard-based method. The reported overhead is only 2.20% over vanilla NVFP4, so the method is a hardware-efficient option for practitioners working on low-bit quantization in LLMs.
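To make the idea of splitting off the mean before quantization concrete, here is a minimal sketch in plain Python. It is not the Averis implementation: the function names are hypothetical, the data is synthetic, the mean is assumed to be taken per channel across tokens and kept in full precision, and a single absmax scale per row stands in for NVFP4's finer block scaling. Only the FP4 E2M1 value grid is taken from the format itself.

```python
import random

# Magnitudes representable in FP4 E2M1 (sign is handled separately).
FP4_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def fake_quant_fp4(row):
    """Simulate FP4 rounding of one row using a single absmax scale.

    Assumption: one scale per row, a simplification of block scaling.
    """
    amax = max(abs(v) for v in row)
    if amax == 0.0:
        return list(row)
    scale = amax / FP4_GRID[-1]  # map the largest magnitude onto 6.0
    out = []
    for v in row:
        # Pick the nearest representable magnitude, then restore sign and scale.
        mag = min(FP4_GRID, key=lambda g: abs(abs(v) / scale - g))
        out.append((mag if v >= 0 else -mag) * scale)
    return out


def quantize_direct(x):
    """Baseline: quantize each row (token) as-is."""
    return [fake_quant_fp4(row) for row in x]


def quantize_mean_residual(x):
    """Hypothetical mean-residual split: subtract the per-channel mean,
    quantize only the residual, then add the full-precision mean back."""
    n, d = len(x), len(x[0])
    mu = [sum(row[j] for row in x) / n for j in range(d)]
    out = []
    for row in x:
        resid = [v - m for v, m in zip(row, mu)]
        q = fake_quant_fp4(resid)
        out.append([r + m for r, m in zip(q, mu)])
    return out


def mse(a, b):
    """Mean squared error between two equally shaped nested lists."""
    total = sum((u - v) ** 2 for ra, rb in zip(a, b) for u, v in zip(ra, rb))
    return total / (len(a) * len(a[0]))


# Synthetic activations: 64 tokens x 16 channels of unit-variance noise,
# with channel 0 carrying a large shared offset (the "mean bias").
random.seed(0)
x = [[random.gauss(0.0, 1.0) + (50.0 if j == 0 else 0.0) for j in range(16)]
     for _ in range(64)]

direct_err = mse(x, quantize_direct(x))
split_err = mse(x, quantize_mean_residual(x))
print("direct quantization MSE:", direct_err)
print("mean-residual split MSE:", split_err)
```

In this toy setup the biased channel dominates each row's scale, so direct quantization rounds most of the small values in the other channels to zero. Removing the shared mean first lets the scale track the residual instead, which is the effect the summary describes at a high level.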

quantization · llm · fp4
