25 May 2024 | Yuxuan Yan, Shunpu Tang, Zhiguo Shi, Qianqian Yang
FeDeRA is a parameter-efficient fine-tuning (PEFT) method for pre-trained language models (PLMs) in federated learning (FL) settings. It extends low-rank adaptation (LoRA) by applying singular value decomposition (SVD) to the pre-trained weight matrices and using the principal components to initialize the low-rank adapter matrices. This approach reduces the number of trainable parameters and improves training efficiency while maintaining task performance. FeDeRA outperforms existing PEFT methods, is comparable to full-parameter fine-tuning (FFT) in task performance, and reduces training time by over 90%. It is also robust to data heterogeneity, maintaining stable performance as client data distributions become increasingly non-IID. Evaluations across a range of tasks and datasets demonstrate its effectiveness in FL scenarios with limited communication and computational resources. The SVD-based initialization yields more stable weight updates and faster convergence, making FeDeRA a promising solution for efficient PLM fine-tuning in FL.
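
To make the SVD-based initialization concrete, here is a minimal sketch of the idea, not the authors' implementation: the adapter matrices A and B are initialized from the top-r singular components of a frozen pre-trained weight matrix, instead of LoRA's default random-A/zero-B initialization. The function name `federa_svd_init` and the split of singular values between the two factors are illustrative assumptions.

```python
import torch

def federa_svd_init(W: torch.Tensor, r: int):
    """Illustrative SVD-based initialization of LoRA adapters (sketch).

    W : frozen pre-trained weight matrix of shape (d_out, d_in)
    r : adapter rank
    Returns trainable adapters A (r, d_in) and B (d_out, r) such that
    B @ A equals the best rank-r approximation of W at initialization.
    """
    # Thin SVD of the pre-trained weight: W = U diag(S) Vh
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)

    # Keep only the r principal singular components.
    U_r, S_r, Vh_r = U[:, :r], S[:r], Vh[:r, :]

    # Split the singular values evenly between the two factors
    # (an assumed convention) so that B @ A = U_r diag(S_r) Vh_r.
    B = U_r * S_r.sqrt()                  # (d_out, r)
    A = S_r.sqrt().unsqueeze(1) * Vh_r    # (r, d_in)
    return A, B

# Example: initialize adapters for a 768x768 attention projection with rank 8.
W = torch.randn(768, 768)
A, B = federa_svd_init(W, r=8)
```

In a federated setup, only the small A and B matrices would then be trained and exchanged between clients and the server, which is what keeps the communication and computation costs low.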