Training · arXiv cs.AI — 7 d ago

LoRA-Muon: Spectral Steepest Descent on the Low-Rank Manifold

LoRA-Muon applies the spectral steepest-descent rule from the Muon optimizer to Low-Rank Adaptation (LoRA) finetuning. It offers two properties: learning rates that transfer better across model dimensions, and a compute-efficient design that avoids QR decomposition and second-moment storage, which suits accelerator hardware. In the reported evaluations, a rank-32 LoRA-Muon configuration reached lower mean validation loss than dense training baselines.
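For readers unfamiliar with the spectral steepest-descent idea, the sketch below shows one generic way to orthogonalize a gradient matrix without QR or SVD: the classical cubic Newton-Schulz iteration, which drives every singular value toward 1 using only matrix multiplications. It is an illustration under stated assumptions, not the LoRA-Muon algorithm itself. The helper names (`matmul`, `transpose`, `orthogonalize`), the step count, and the toy gradient are all hypothetical, and Muon implementations typically use a tuned polynomial rather than this textbook variant.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def orthogonalize(g, steps=30):
    """Approximate the polar factor U V^T of g (assumed full row rank)."""
    # Scale by the Frobenius norm so every singular value lies in (0, 1],
    # which is inside the convergence region of the cubic iteration.
    fro = math.sqrt(sum(v * v for row in g for v in row))
    x = [[v / fro for v in row] for row in g]
    for _ in range(steps):
        # X <- 1.5 X - 0.5 (X X^T) X pushes each singular value toward 1.
        xxt_x = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * a - 0.5 * b for a, b in zip(r1, r2)]
             for r1, r2 in zip(x, xxt_x)]
    return x


# Hypothetical 2x3 gradient with orthogonal rows (singular values 5 and 2).
grad = [[3.0, 0.0, 4.0], [0.0, 2.0, 0.0]]
q = orthogonalize(grad)          # close to [[0.6, 0.0, 0.8], [0.0, 1.0, 0.0]]
gram = matmul(q, transpose(q))   # close to the 2x2 identity
```

The result keeps the singular directions of the gradient but equalizes their magnitudes, which is the sense in which the update is "spectral". A practical version would use an array library on the accelerator; plain lists are used here only to keep the sketch dependency-free.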

low-rank adaptation · fine-tuning
