MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning

24 Jun 2024 | Pengjie Ren, Chengshun Shi, Shiguang Wu, Mengqi Zhang, Zhaochun Ren, Maarten de Rijke, Zhumin Chen, Jiahuan Pei
MELoRA is a parameter-efficient fine-tuning method that improves performance with fewer trainable parameters by using a mini-ensemble of low-rank adapters. It freezes the original pretrained weights and trains multiple mini LoRAs in parallel, achieving a higher equivalent rank without increasing the number of trainable parameters, while the diversity among the mini LoRAs improves generalization.

Theoretical analysis and empirical studies on a range of NLP tasks show that MELoRA outperforms LoRA with significantly fewer parameters: about 8 times fewer on natural language understanding tasks and 36 times fewer on instruction-following tasks. Its key advantages are a higher rank for a given parameter budget, flexible rank adjustment, and lower computational complexity.

Experiments on the GLUE and INSTRUCTEVAL benchmarks confirm its effectiveness, with MELoRA achieving consistent improvements over LoRA and other variants across multiple datasets and models.
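To make the mechanism concrete, below is a minimal PyTorch sketch of the mini-ensemble idea: a layer that freezes a pretrained nn.Linear and attaches several parallel mini LoRA pairs, each acting on its own slice of the input and output dimensions so that the overall weight update is block-diagonal. The class and hyperparameter names (MELoRALinear, n_minis, mini_rank, alpha) are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class MELoRALinear(nn.Module):
    """Sketch of a MELoRA-style layer: frozen pretrained weight plus
    n parallel mini LoRA pairs whose combined update is block-diagonal."""

    def __init__(self, base_linear: nn.Linear, n_minis: int = 4,
                 mini_rank: int = 2, alpha: float = 8.0):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights

        in_f, out_f = base_linear.in_features, base_linear.out_features
        assert in_f % n_minis == 0 and out_f % n_minis == 0, \
            "feature dims must split evenly across the mini LoRAs"
        self.in_chunk = in_f // n_minis
        self.out_chunk = out_f // n_minis
        self.scaling = alpha / mini_rank  # LoRA-style scaling (assumed)

        # Each mini LoRA only sees one slice of the input/output dims,
        # so the equivalent update has rank up to n_minis * mini_rank
        # while the parameter count matches a single rank-`mini_rank` LoRA.
        self.A = nn.ParameterList(
            [nn.Parameter(torch.randn(self.in_chunk, mini_rank) * 0.01)
             for _ in range(n_minis)]
        )
        self.B = nn.ParameterList(
            [nn.Parameter(torch.zeros(mini_rank, self.out_chunk))
             for _ in range(n_minis)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.base(x)
        # Split the input into chunks; each mini LoRA updates its own output chunk.
        x_chunks = x.split(self.in_chunk, dim=-1)
        delta = torch.cat(
            [(x_i @ self.A[i]) @ self.B[i] for i, x_i in enumerate(x_chunks)],
            dim=-1,
        )
        return out + self.scaling * delta


# Usage: wrap a projection layer so only the mini LoRA factors are trained.
layer = MELoRALinear(nn.Linear(768, 768), n_minis=4, mini_rank=2)
y = layer(torch.randn(8, 768))
```

With n_minis mini LoRAs of rank mini_rank on a d-by-d weight, each pair holds 2 * (d / n_minis) * mini_rank parameters, so the total equals a single LoRA of rank mini_rank, yet the block-diagonal update can reach rank n_minis * mini_rank. This is the sense in which MELoRA attains a higher rank without adding parameters.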