Communication-Efficient Learning of Deep Networks from Decentralized Data


26 Jan 2023 | H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, Blaise Agüera y Arcas
This paper introduces Federated Learning, a decentralized approach to training deep networks on data distributed across mobile devices. Unlike traditional methods that require centralized data storage, Federated Learning trains models on data that never leaves the device; only model updates are shared with a central server. This enhances privacy and reduces communication costs, since raw data is never transmitted.

The paper presents FederatedAveraging (FedAvg), an algorithm that combines local stochastic gradient descent on each device with server-side model averaging, and demonstrates that this method is robust to the unbalanced and non-IID data distributions common in decentralized settings. Communication cost is the primary constraint, and FedAvg reduces the number of communication rounds needed to train a deep network by up to 100× compared to synchronized stochastic gradient descent.

FederatedAveraging is evaluated on image classification and language modeling tasks. For image classification, it uses the MNIST dataset under both IID and non-IID client partitions. For language modeling, it uses a dataset derived from Shakespeare's works, partitioned by speaking role in the plays. The algorithm performs well on both tasks, reaching high accuracy in fewer communication rounds.

The paper also discusses the privacy benefits of Federated Learning, noting that sharing only model updates minimizes the risk of user data exposure, and contrasts the approach with traditional data center training to highlight the importance of communication efficiency in decentralized settings. It concludes that Federated Learning is a practical approach for training deep networks on decentralized data, offering significant improvements in privacy, communication efficiency, and model performance. Future work includes exploring stronger privacy guarantees through differential privacy and secure multi-party computation.
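To make the client/server structure of FederatedAveraging concrete, here is a minimal Python sketch of the idea summarized above: each sampled client runs a few epochs of local SGD starting from the current global model, and the server averages the returned weights, weighted by each client's number of examples. This is an illustrative toy using a NumPy linear model, not the authors' implementation; the data, model, and hyperparameters below are assumed placeholders.

```python
import numpy as np

def local_sgd(weights, X, y, epochs=5, batch_size=10, lr=0.1):
    """One client's update: a few epochs of mini-batch SGD on its local data
    (least-squares loss here purely for illustration)."""
    w = weights.copy()
    n = len(X)
    for _ in range(epochs):
        order = np.random.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            grad = X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)
            w -= lr * grad
    return w

def federated_averaging(clients, rounds=20, client_fraction=0.5, dim=10):
    """Server loop: each round, sample a fraction of clients, collect their
    locally trained weights, and average them weighted by local dataset size."""
    global_w = np.zeros(dim)
    for _ in range(rounds):
        m = max(1, int(client_fraction * len(clients)))
        sampled = np.random.choice(len(clients), m, replace=False)
        updates, sizes = [], []
        for k in sampled:
            X, y = clients[k]
            updates.append(local_sgd(global_w, X, y))
            sizes.append(len(X))
        # Weighted average: clients with more examples contribute more.
        global_w = np.average(updates, axis=0, weights=np.array(sizes, float))
    return global_w

# Toy usage: 10 clients with small synthetic regression datasets of varying size.
rng = np.random.default_rng(0)
true_w = rng.normal(size=10)
clients = []
for _ in range(10):
    X = rng.normal(size=(int(rng.integers(30, 100)), 10))
    clients.append((X, X @ true_w + 0.01 * rng.normal(size=len(X))))
print(federated_averaging(clients))
```

Only the averaged weights cross the network in this sketch, which mirrors the communication-efficiency argument: more local computation per round (epochs, smaller batches) trades device work for fewer communication rounds.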