Training
Steering the Noise: Turning Random Perturbations into Effective Descent for Memory-Efficient LLM Fine-Tuning
The paper presents a framework for fine-tuning large language models (LLMs) that improves zeroth-order (ZO) optimization by deriving better descent directions from random perturbations. It introduces two methods, MeZO-GV and MeZO-Greedy, which use candidate perturbations to align the update more closely with the optimization objective, leading to faster convergence. In experiments on the OPT-13B model across 11 benchmarks, the approach outperforms all ZO baselines and surpasses gradient-based methods on 9 of them, while remaining memory-efficient.
fine-tuning, llm, optimization
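The summary above does not spell out how MeZO-GV or MeZO-Greedy choose among candidate perturbations, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It shows a standard two-point zeroth-order step (estimating a directional derivative from two loss evaluations, with no backpropagation) on a toy quadratic loss, plus a hypothetical selection rule that samples several candidate perturbations and keeps the one with the largest estimated directional derivative. All names (`loss`, `projected_grad`, `zo_step`, `run`), the selection rule, and the hyperparameters are assumptions made for illustration.

```python
import random


def loss(x, target):
    """Toy quadratic loss standing in for a fine-tuning objective."""
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


def projected_grad(f, x, z, eps):
    """Two-point estimate of the directional derivative of f along z."""
    plus = [xi + eps * zi for xi, zi in zip(x, z)]
    minus = [xi - eps * zi for xi, zi in zip(x, z)]
    return (f(plus) - f(minus)) / (2 * eps)


def zo_step(f, x, rng, lr=0.02, eps=1e-3, num_candidates=1):
    """One zeroth-order update.

    With num_candidates == 1 this is a plain two-point ZO step.
    With more candidates it applies an ASSUMED greedy rule: keep the
    perturbation whose estimated directional derivative is largest
    in magnitude. This rule is illustrative, not taken from the paper.
    """
    best_g, best_z = 0.0, None
    for _ in range(num_candidates):
        z = [rng.gauss(0.0, 1.0) for _ in x]
        g = projected_grad(f, x, z, eps)
        if best_z is None or abs(g) > abs(best_g):
            best_g, best_z = g, z
    # Move against the estimated slope along the chosen direction.
    return [xi - lr * best_g * zi for xi, zi in zip(x, best_z)]


def run(num_candidates, steps=200, seed=0):
    """Minimize the toy loss and return its final value."""
    rng = random.Random(seed)
    target = [1.0, -2.0, 0.5, 3.0]

    def f(x):
        return loss(x, target)

    x = [0.0] * len(target)
    for _ in range(steps):
        x = zo_step(f, x, rng, num_candidates=num_candidates)
    return f(x)


if __name__ == "__main__":
    print("single perturbation:", run(num_candidates=1))
    print("best of 4 candidates:", run(num_candidates=4))
```

Each candidate costs two extra loss evaluations, so in this sketch any gain from a better-aligned direction has to be weighed against the additional forward passes. Memory use stays at the level of forward evaluation, since no gradients are stored.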