LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters


27 May 2024 | Klaudia Bałazy, Mohammadreza Banaei, Karl Aberer, Jacek Tabor
LoRA-XS is a parameter-efficient fine-tuning method that sharply reduces the number of trainable parameters while maintaining or even improving model performance. It leverages Singular Value Decomposition (SVD) to initialize its adaptation matrices, aligning them with the principal components of the pretrained weights. This allows LoRA-XS to achieve extreme parameter efficiency, reducing the number of trainable parameters by over 100x in 7B models compared to LoRA. LoRA-XS outperforms LoRA and recent state-of-the-art methods such as VeRA in parameter efficiency while remaining competitive across benchmarks including GLUE, GSM8k, and MATH.

The method introduces a small trainable $ r \times r $ matrix between two frozen low-rank matrices derived from the truncated SVD of the original weight matrix. This design makes the number of trainable parameters independent of the model's hidden dimensions, enabling more efficient fine-tuning, especially for larger models, and gives finer control over the number of additional parameters and the resulting memory footprint. LoRA-XS retains the core advantages of LoRA: no modifications to the model architecture and no additional latency during inference.

Experiments show that LoRA-XS achieves competitive or superior performance across tasks and model scales, with significant reductions in storage requirements. The SVD-based initialization improves both model performance and training efficiency, as demonstrated through comprehensive experiments. LoRA-XS is particularly effective for instruction tuning of large models, delivering competitive performance with far fewer parameters, and it is also compatible with quantized models, further reducing memory usage. Overall, LoRA-XS offers substantial reductions in trainable parameters while maintaining high performance.
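To make the construction concrete, below is a minimal sketch of a LoRA-XS-style adapter wrapped around a linear layer, assuming PyTorch. The class name `LoRAXSLinear`, the zero initialization of the trainable matrix $ R $, and the exact way the frozen factors are built from the truncated SVD are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal LoRA-XS-style sketch (assumptions: PyTorch; names and the zero
# initialization of R are illustrative, not the authors' released code).
import torch
import torch.nn as nn


class LoRAXSLinear(nn.Module):
    def __init__(self, base_linear: nn.Linear, r: int = 16):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen

        W = self.base.weight.data  # shape (out_features, in_features)
        # Truncated SVD of the pretrained weight: W ≈ U_r diag(S_r) Vh_r
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        U_r, S_r, Vh_r = U[:, :r], S[:r], Vh[:r, :]

        # Frozen projection matrices aligned with the principal components.
        self.register_buffer("B", U_r * S_r)  # (out_features, r)
        self.register_buffer("A", Vh_r)       # (r, in_features)

        # The only trainable parameters: a small r x r matrix.
        # Zero init keeps the adapted model at the pretrained weights at the
        # start of training (assumption made for this sketch).
        self.R = nn.Parameter(torch.zeros(r, r))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = W x + B R A x ; gradients flow only into R.
        return self.base(x) + (x @ self.A.T) @ self.R.T @ self.B.T


# Trainable parameters per adapted matrix are r*r, independent of hidden size:
layer = LoRAXSLinear(nn.Linear(4096, 4096), r=16)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 256, vs. roughly 2 * r * 4096 = 131072 for LoRA at the same rank
```

Because only $ R $ is trained, the adapter's size stays fixed as the model's hidden dimension grows, and the product $ B R A $ can be merged back into the frozen weight after training, so inference incurs no extra latency.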