PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models

28 May 2024 | Fanxu Meng, Zhaohui Wang, Muhan Zhang
PiSSA is a parameter-efficient fine-tuning method for large language models (LLMs) that improves upon the Low-Rank Adaptation (LoRA) technique. Whereas LoRA initializes its adapter matrices with Gaussian noise and zeros, PiSSA initializes them with the principal singular values and singular vectors of the original weight matrix, obtained via singular value decomposition, and freezes the remaining residual. Because the adapters start from the most significant components of the pretrained weights, PiSSA converges faster and reaches better final performance than LoRA. It also shares LoRA's architecture, so it stays compatible with quantization, further reducing the memory cost of fine-tuning.

Comparative experiments across 12 models, ranging from 184M to 70B parameters, show that PiSSA consistently outperforms LoRA on both natural language generation (NLG) and natural language understanding (NLU) tasks. On the GSM8K benchmark, Mistral-7B fine-tuned with PiSSA reaches 72.86% accuracy, exceeding LoRA's 67.7% by 5.16 percentage points. PiSSA's quantization error is also significantly smaller than that of QLoRA, making it more effective for quantized fine-tuning, and its advantage holds across different model sizes, model types, and amounts of training data.

PiSSA thus concentrates training on the principal components of the model, which the authors liken to slicing out and re-baking the richest slice of a pizza. Limitations remain, such as verifying its applicability to other types of models and combining it with improvements developed for existing LoRA variants. Overall, PiSSA offers a more efficient and effective method for fine-tuning large language models.
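To make the initialization concrete, below is a minimal PyTorch sketch of the SVD-based split described above: the top-r singular triplets of a pretrained weight matrix seed the trainable adapter pair, while the residual stays frozen. The function name `pissa_init` and the weight convention (shape `(out_features, in_features)`, forward pass `x @ (W_res + A @ B).T`) are illustrative assumptions, not the authors' released code.

```python
import torch

def pissa_init(W: torch.Tensor, r: int):
    """Split pretrained weight W into a trainable low-rank pair (A, B) built
    from its top-r singular values/vectors, plus a frozen residual W_res."""
    # Reduced SVD of the pretrained weight: W = U diag(S) V^T
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)

    # Principal (top-r) components initialize the adapter, with the singular
    # values split evenly between the two factors.
    sqrt_S = torch.sqrt(S[:r])
    A = U[:, :r] * sqrt_S             # shape (out_features, r)
    B = sqrt_S.unsqueeze(1) * Vh[:r]  # shape (r, in_features)

    # The residual (tail) part of the weight stays frozen during fine-tuning.
    W_res = W - A @ B
    return A, B, W_res

# Usage: at initialization the split is exact, so the model's output is unchanged.
W = torch.randn(1024, 1024)
A, B, W_res = pissa_init(W, r=16)
print(torch.dist(W, W_res + A @ B))  # ~0
```

Under this sketch, quantized fine-tuning would quantize only the frozen residual `W_res`, which no longer contains the largest singular components, which is the intuition behind PiSSA's smaller quantization error relative to QLoRA.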