MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning
**Abstract:**
Efficient finetuning of large language models (LLMs) aims to adapt these models at reduced computation and memory cost. Previous LoRA-based approaches initialize one low-rank matrix from a Gaussian distribution and the other to zeros, while keeping the original weight matrices frozen. However, this random initialization may interfere with the well-learned subspace of the pretrained weight matrix. This paper proposes MiLoRA, a simple yet effective LLM finetuning method that updates only the minor singular components of the weight matrix while keeping the principal singular components frozen. The minor matrix corresponds to noisy or long-tail information, while the principal matrix contains important knowledge. MiLoRA initializes the low-rank matrices within a subspace orthogonal to the principal matrix, preserving the pretrained knowledge. Extensive experiments on commonsense reasoning, math reasoning, and instruction-following benchmarks demonstrate the superior performance of MiLoRA compared to LoRA and PiSSA without sacrificing training or inference efficiency.
**Introduction:**
Large language models (LLMs) have demonstrated superior performance on various tasks, but full finetuning requires substantial computational resources. Parameter-efficient finetuning (PEFT) methods aim to reduce these costs. LoRA is a widely used PEFT method that assumes the weight updates induced by finetuning are low-rank. However, existing LoRA-based approaches randomly initialize the low-rank matrices, potentially overriding important pretrained features. MiLoRA addresses this issue by initializing the low-rank matrices within a subspace orthogonal to the principal matrix, effectively learning from the finetuning dataset while preserving pretrained knowledge.
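The standard LoRA setup referenced above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation; the matrix sizes, rank, and scaling factor `alpha` are hypothetical choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, r = 64, 64, 8                  # weight shape and a hypothetical low rank
W = rng.standard_normal((d, k))      # frozen pretrained weight

# Standard LoRA initialization: A ~ Gaussian, B = 0, so the product B @ A
# is zero at the start of finetuning and only A, B receive gradient updates.
A = rng.standard_normal((r, k)) * 0.01
B = np.zeros((d, r))

def lora_forward(x, W, A, B, alpha=16):
    """Forward pass with the low-rank adapter added to the frozen weight."""
    return x @ (W + (alpha / r) * B @ A).T

x = rng.standard_normal((2, k))
# With B = 0 the adapter contributes nothing, so the adapted model
# initially matches the pretrained model exactly.
assert np.allclose(lora_forward(x, W, A, B), x @ W.T)
```

Because the Gaussian/zero initialization places the trainable subspace at random, updates can fall anywhere in the weight space, which is the interference MiLoRA is designed to avoid.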
**Methodology:**
MiLoRA decomposes the weight matrix using singular value decomposition (SVD) and divides it into principal and minor matrices based on singular values. The principal matrix captures essential knowledge, while the minor matrix contains noisy or long-tail information. MiLoRA keeps the principal matrix frozen and adapts the minor singular components during finetuning. This approach encourages the model to learn in the less-optimized subspace, reducing interference with pretrained knowledge.
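The SVD-based split described above can be sketched in a few lines of NumPy. This is an illustrative reconstruction under stated assumptions, not the authors' code: the matrix size and rank budget `r` are toy values, and the square-root factorization of the minor singular values into the two adapter matrices is one natural way to realize the initialization.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((32, 32))   # a pretrained weight matrix (toy size)
r = 4                               # rank budget for the trainable adapter

# SVD returns singular values in descending order.
U, S, Vt = np.linalg.svd(W, full_matrices=False)

# Principal matrix: top singular components, kept frozen during finetuning.
W_principal = (U[:, :-r] * S[:-r]) @ Vt[:-r, :]

# Minor components: bottom-r singular triplets initialize the trainable
# low-rank factors, splitting each singular value across the two matrices.
sqrt_s = np.sqrt(S[-r:])
B = U[:, -r:] * sqrt_s              # left factor, shape (32, r)
A = sqrt_s[:, None] * Vt[-r:, :]    # right factor, shape (r, 32)

# At initialization, principal + B @ A reconstructs W exactly, so the
# adapted model starts identical to the pretrained one; finetuning then
# moves only the minor (least-optimized) subspace.
assert np.allclose(W_principal + B @ A, W)
```

The key property is the last assertion: unlike Gaussian/zero initialization, the trainable factors start inside the minor subspace, orthogonal to the principal directions that encode the pretrained knowledge.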
**Experiments:**
MiLoRA is evaluated on three diverse tasks: commonsense reasoning, math reasoning, and instruction-following. Results show that MiLoRA consistently outperforms LoRA and PiSSA, achieving superior performance on various datasets without sacrificing efficiency.
**Conclusion:**
MiLoRA is a simple yet effective PEFT method that enhances model performance with lower training costs. It promotes the broader application of large models across diverse groups but also highlights the need to prevent the misuse of PEFT methods.