5 Sep 2024 | Herbert Woisetschläger, Alexander Erben, Shiqiang Wang, Ruben Mayer, Hans-Arno Jacobsen
This paper presents a comprehensive survey of efficient Federated Learning (FL) methods for training Foundation Models (FMs). FMs, which have gained significant traction in the deep learning community, are pre-trained on extensive datasets and can be fine-tuned for specific downstream tasks. However, access to such large datasets is often restricted by privacy concerns. FL addresses this by enabling collaborative training across decentralized clients without sharing raw data, making it a natural fit for FMs.
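To make the collaborative setup concrete, here is a minimal sketch of a FedAvg-style aggregation round (not the paper's own code; `local_update` is a hypothetical placeholder for a real local training loop). Clients compute updates on their private data and only model parameters travel to the server, which averages them weighted by client dataset size.

```python
import numpy as np

def local_update(global_weights, client_data, lr=0.01):
    """Hypothetical client step: train locally on private data and return
    updated weights; the raw data never leaves the client."""
    # Stand-in for a few local SGD epochs on client_data.
    gradient = np.random.randn(*global_weights.shape) * 0.01
    return global_weights - lr * gradient, len(client_data)

def fedavg_round(global_weights, clients):
    """One FedAvg-style round: clients train locally, the server aggregates
    the returned weights proportionally to each client's data size."""
    updates, sizes = [], []
    for data in clients:
        w, n = local_update(global_weights, data)
        updates.append(w)
        sizes.append(n)
    total = sum(sizes)
    # Weighted average of client models; only parameters are communicated.
    return sum(w * (n / total) for w, n in zip(updates, sizes))

# Toy usage: three clients holding private datasets of different sizes.
global_w = np.zeros(10)
clients = [np.arange(50), np.arange(80), np.arange(30)]
for _ in range(5):
    global_w = fedavg_round(global_w, clients)
```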
The authors introduce a novel taxonomy focused on computational and communication efficiency, which are crucial for making FL effective with FMs. They discuss various methods for computational efficiency, including full model training, parameter-efficient fine-tuning (PEFT), prompt tuning, and instruction tuning. For communication efficiency, they explore model pruning and full model compression techniques such as quantization, sparsification, and gradient projection.
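The following sketch illustrates why these two axes of the taxonomy compose well: a LoRA-style PEFT adapter already shrinks what each client must transmit from a full weight matrix to two low-rank factors, and a simple uniform 8-bit quantizer (a generic scheme assumed here, not a method prescribed by the paper) compresses that payload further before it is sent to the server.

```python
import numpy as np

def quantize_int8(update):
    """Uniform 8-bit quantization of a parameter update: the client sends
    int8 values plus a single float scale instead of float32 values."""
    scale = float(np.max(np.abs(update))) / 127.0
    scale = scale if scale > 0 else 1.0
    q = np.clip(np.round(update / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Server-side reconstruction of the approximate update."""
    return q.astype(np.float32) * scale

# PEFT-style savings: instead of a full d x d weight update, a LoRA-style
# adapter communicates two low-rank factors A (d x r) and B (r x d).
d, r = 1024, 8
full_update = np.random.randn(d, d).astype(np.float32)
lora_A = np.random.randn(d, r).astype(np.float32)
lora_B = np.random.randn(r, d).astype(np.float32)

q_A, s_A = quantize_int8(lora_A)
q_B, s_B = quantize_int8(lora_B)

full_bytes = full_update.nbytes            # float32 full-matrix update
peft_bytes = q_A.nbytes + q_B.nbytes + 8   # int8 adapters + two float scales
print(f"full update: {full_bytes} bytes, quantized LoRA update: {peft_bytes} bytes")
```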
The paper evaluates existing FL frameworks for their readiness to handle FMs, noting that while some frameworks support PEFT, they lack efficient communication techniques. It also highlights the need for future research on evaluating generative models in FL, the interplay between privacy and PEFT, and the challenges of non-IID data.
Key contributions include:
1. A novel taxonomy that identifies synergies between FL methods for FMs and efficient communication methods.
2. A holistic evaluation of existing FL computational and communication efficiency methods.
3. Discussion on future research directions, emphasizing the importance of improving computational and communication efficiency for FMs in FL applications.