Training
Quantized Evolution Strategies: High-precision Fine-tuning of Quantized LLMs at Low-precision Cost
The paper introduces Quantized Evolution Strategies (QES), an optimization method for fine-tuning quantized Large Language Models (LLMs) directly in the quantized parameter space. QES combines two ideas: accumulated error feedback, which allows high-precision weight updates even though the weights themselves stay quantized, and stateless seed replay, which keeps memory usage low. The authors report that QES significantly outperforms existing zeroth-order fine-tuning techniques, which makes fine-tuning more practical on memory-constrained devices.
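The two mechanisms named above can be illustrated with a toy sketch. This is not the paper's implementation: the function names (`noise_from_seed`, `quantize`, `apply_with_error_feedback`, `es_step`), the uniform rounding grid, and the mean-baseline gradient estimate are all assumptions chosen for illustration. The sketch assumes that "error feedback" means carrying the rounding error of each update into the next one, and that "seed replay" means regenerating perturbations from their seeds rather than storing them.

```python
import numpy as np

def noise_from_seed(seed, shape):
    # Seed replay (assumed form): the same seed always yields the same noise,
    # so only the integer seed has to be kept, never the noise tensor.
    return np.random.default_rng(seed).standard_normal(shape)

def quantize(w, scale):
    # Hypothetical uniform grid: snap to the nearest multiple of `scale`.
    return np.round(w / scale) * scale

def apply_with_error_feedback(q_weights, residual, update, scale):
    # Add the carried-over rounding error to the new update, re-quantize,
    # and keep whatever the grid could not represent for the next step.
    target = q_weights + residual + update
    new_q = quantize(target, scale)
    return new_q, target - new_q

def es_step(q_weights, residual, loss_fn, seeds, sigma, lr, scale):
    shape = q_weights.shape
    # Pass 1: score each perturbed candidate; only scalar losses are stored.
    losses = np.array([
        loss_fn(quantize(q_weights + sigma * noise_from_seed(s, shape), scale))
        for s in seeds
    ])
    advantages = losses - losses.mean()
    # Pass 2: replay the seeds to rebuild each perturbation on the fly.
    grad = np.zeros(shape)
    for s, a in zip(seeds, advantages):
        grad += a * noise_from_seed(s, shape)
    grad /= len(seeds) * sigma
    return apply_with_error_feedback(q_weights, residual, -lr * grad, scale)
```

In this sketch, an update smaller than half a grid step would vanish if each step were rounded on its own. With the residual carried forward, repeated small updates eventually add up to a full grid step, which is the sense in which the update is "high-precision" here.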
quantization, fine-tuning, llm