Federated Optimization: Distributed Machine Learning for On-Device Intelligence

October 11, 2016 | Jakub Konečný, H. Brendan McMahan, Daniel Ramage, Peter Richtárik
The paper introduces a new setting for distributed optimization in machine learning, called *Federated Optimization*, in which data is unevenly distributed across a very large number of nodes and the goal is to train a high-quality centralized model. Because each round of communication between the nodes and the central server is expensive, the main challenge is communication efficiency: the number of communication rounds must be kept to a minimum. The authors motivate this setting by considering training data stored locally on users' mobile devices, where each device performs computation on its own local data to update a global model. They argue that existing algorithms are not suited to this setting and propose a new algorithm that shows promising results for sparse convex problems.

The paper also discusses the privacy and security implications of federated learning, suggesting that keeping raw data on the device can significantly reduce these risks by limiting the attack surface to the device rather than both the device and the cloud. The authors then focus on the optimization problem itself and provide a detailed overview of the relevant literature, including baseline algorithms such as Gradient Descent and Stochastic Gradient Descent, as well as more recent randomized methods such as Stochastic Dual Coordinate Ascent (SDCA) and Stochastic Variance Reduced Gradient (SVRG). They review distributed and communication-efficient optimization algorithms, highlighting the limitations of existing methods in the federated setting. Finally, they introduce their proposed algorithm, designed to handle the unique challenges of federated optimization, and demonstrate its effectiveness through experimental results.
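To make the setting concrete, here is a minimal, hypothetical NumPy sketch of one communication round in a federated setup of this kind: the server broadcasts the current global model, each node runs a little SGD on its own local data, and the server aggregates the returned models weighted by local data size. The loss, learning rate, and helper names are illustrative assumptions; this is a generic local-update scheme, not the specific algorithm proposed in the paper.

```python
import numpy as np

def local_update(w, X, y, lr=0.1, epochs=1):
    """Run a few epochs of SGD on one node's local data (illustrative least-squares loss)."""
    w = w.copy()
    for _ in range(epochs):
        for i in np.random.permutation(len(y)):
            grad = (X[i] @ w - y[i]) * X[i]   # gradient of 0.5 * (x_i^T w - y_i)^2
            w -= lr * grad
    return w

def federated_round(w_global, node_data):
    """One communication round: broadcast, local computation, weighted aggregation."""
    updates, sizes = [], []
    for X, y in node_data:                    # in practice this loop runs in parallel, on-device
        updates.append(local_update(w_global, X, y))
        sizes.append(len(y))
    weights = np.array(sizes, dtype=float) / sum(sizes)   # weight nodes by local data size n_k
    return sum(wk * uk for wk, uk in zip(weights, updates))

# Tiny synthetic example: three nodes holding unevenly sized local datasets.
rng = np.random.default_rng(0)
w_true = rng.normal(size=5)
node_data = []
for n_k in (50, 200, 10):
    X = rng.normal(size=(n_k, 5))
    node_data.append((X, X @ w_true + 0.01 * rng.normal(size=n_k)))

w = np.zeros(5)
for _ in range(20):                           # a handful of communication rounds
    w = federated_round(w, node_data)
print(np.linalg.norm(w - w_true))             # distance to the true model shrinks over rounds
```

Weighting each node by its local data size mirrors the unevenly distributed data the paper emphasizes, and the number of outer rounds in the loop is exactly the quantity federated optimization tries to minimize.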
[slides and audio] Federated Optimization: Distributed Machine Learning for On-Device Intelligence