MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning

2024 | Hanqing Wang, Zeguan Xiao, Yixia Li, Shuo Wang, Guanhua Chen, Yun Chen
MiLoRA is a parameter-efficient fine-tuning method for large language models (LLMs) that focuses on the minor singular components of weight matrices. Unlike standard LoRA, which initializes the low-rank matrices with random values, MiLoRA decomposes each weight matrix with singular value decomposition (SVD), keeps the principal singular components frozen, and updates only the minor components. The minor singular components are associated with noisy or long-tail information, whereas the principal components encode important pretrained knowledge. By initializing the low-rank matrices in a subspace orthogonal to the principal matrix, MiLoRA preserves the pretrained knowledge while dedicating the less-utilized subspace to learning from the fine-tuning data. Extensive experiments on commonsense reasoning, math reasoning, and instruction-following benchmarks show that MiLoRA outperforms LoRA and PiSSA without sacrificing training or inference efficiency. The method is simple to implement, and when compared with PiSSA, which adapts the principal singular components, MiLoRA achieves superior performance under appropriate hyperparameter settings. These results indicate that MiLoRA is a highly effective method for parameter-efficient LLM fine-tuning.
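The sketch below illustrates the core idea under the assumptions described in the summary: SVD the pretrained weight, freeze the principal part, and initialize the trainable low-rank factors from the bottom-r (minor) singular components. This is not the authors' reference implementation; the class and parameter names (e.g. MiLoRALinear, rank) and the factorization into A and B via the square roots of the singular values are illustrative choices.

```python
# Minimal sketch of MiLoRA-style initialization (illustrative, not the
# official implementation). W = principal part (frozen) + minor part,
# where the minor part seeds the trainable low-rank factors.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MiLoRALinear(nn.Module):
    def __init__(self, base_linear: nn.Linear, rank: int = 16):
        super().__init__()
        W = base_linear.weight.data  # (out_features, in_features)

        # SVD with singular values sorted in descending order.
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)

        # Principal components (largest singular values) are kept frozen.
        U_p, S_p, Vh_p = U[:, :-rank], S[:-rank], Vh[:-rank, :]
        self.weight_residual = nn.Parameter(
            U_p @ torch.diag(S_p) @ Vh_p, requires_grad=False
        )

        # Minor components (smallest r singular values) initialize the
        # trainable factors, so A @ B equals the minor part at step 0 and
        # the full model output is unchanged before training starts.
        U_m, S_m, Vh_m = U[:, -rank:], S[-rank:], Vh[-rank:, :]
        self.lora_A = nn.Parameter(U_m @ torch.diag(S_m.sqrt()))   # (out, r)
        self.lora_B = nn.Parameter(torch.diag(S_m.sqrt()) @ Vh_m)  # (r, in)

        self.bias = base_linear.bias

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        W_eff = self.weight_residual + self.lora_A @ self.lora_B
        return F.linear(x, W_eff, self.bias)
```

In this sketch only lora_A and lora_B receive gradients, so the number of trainable parameters per layer matches vanilla LoRA at the same rank, and the adapted weight can be merged back (weight_residual + lora_A @ lora_B) for inference at no extra cost.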