OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning

2024-02-10 | Rui Ye¹, Wenhao Wang²,³, Jingyi Chai¹, Dihan Li¹, Zexi Li², Yinda Xu¹, Yaxin Du¹, Yanfeng Wang³,¹, Siheng Chen¹,³
OpenFedLLM is a framework for training large language models (LLMs) on decentralized private data via federated learning (FL), enabling collaborative, privacy-preserving training without sharing raw data. It integrates seven representative FL algorithms, federated instruction tuning, and federated value alignment, and supports eight training datasets and over 30 evaluation metrics, allowing users to focus on either FL or LLMs without deep expertise in the other. OpenFedLLM is designed to be efficient: through quantization and parameter-efficient fine-tuning, training can run on a single consumer GPU (e.g., an NVIDIA 3090). Experiments show that FL consistently outperforms local training across settings, including financial benchmarks on which FL-trained models surpass GPT-4. The framework covers diverse domains, including finance, medical, code, and general tasks, and provides comprehensive evaluations. By addressing data scarcity and privacy, OpenFedLLM offers a scalable solution for training LLMs on distributed private data. Future work includes exploring new FL algorithms tailored for LLMs, improving data management, and handling heterogeneous preferences in value alignment. Overall, the framework helps bridge the FL and LLM communities, enabling collaborative training while preserving data privacy.
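To make the workflow concrete, below is a minimal sketch of one federated instruction-tuning setup of the kind the summary describes: clients fine-tune a small adapter on their private data locally, and only the adapter weights are aggregated with FedAvg. This is a hypothetical illustration, not OpenFedLLM's actual API; the tiny linear "adapter" and synthetic client datasets stand in for the LoRA matrices of an LLM and real private instruction data, and it only assumes PyTorch.

```python
# Hypothetical sketch (not OpenFedLLM's API): FedAvg over adapter weights.
# A toy linear layer stands in for a LoRA adapter on a quantized LLM, and
# random tensors stand in for each client's private instruction data.
import copy
import torch

def local_update(adapter, data, lr=1e-2, steps=5):
    """Run a few local SGD steps on one client's private (x, y) pairs."""
    adapter = copy.deepcopy(adapter)          # clients train a local copy
    opt = torch.optim.SGD(adapter.parameters(), lr=lr)
    for _ in range(steps):
        for x, y in data:
            opt.zero_grad()
            loss = torch.nn.functional.mse_loss(adapter(x), y)
            loss.backward()
            opt.step()
    n_examples = sum(x.shape[0] for x, _ in data)
    return adapter.state_dict(), n_examples   # only weights leave the client

def fedavg(states, weights):
    """Weighted average of client adapter states (FedAvg aggregation)."""
    total = sum(weights)
    return {k: sum(w * s[k] for s, w in zip(states, weights)) / total
            for k in states[0]}

# Global "adapter": in practice, the parameter-efficient weights of the LLM.
global_adapter = torch.nn.Linear(8, 1)

# Synthetic private datasets for 3 clients (raw data is never shared).
clients = [[(torch.randn(4, 8), torch.randn(4, 1)) for _ in range(3)]
           for _ in range(3)]

for rnd in range(10):  # communication rounds
    states, sizes = zip(*(local_update(global_adapter, d) for d in clients))
    global_adapter.load_state_dict(fedavg(list(states), list(sizes)))
    print(f"round {rnd}: aggregated adapter weight norm "
          f"{global_adapter.weight.norm().item():.3f}")
```

In a real run, the server would broadcast the aggregated adapter back to clients each round, and federated value alignment would follow the same pattern with a preference-based objective in place of the supervised loss.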