Deep Leakage from Gradients

19 Dec 2019 | Ligeng Zhu, Zhijian Liu, Song Han
The paper "Deep Leakage from Gradients" by Ligeng Zhu, Zhijian Liu, and Song Han from MIT addresses the critical issue of gradient sharing in multi-node machine learning systems, such as distributed training and collaborative learning. The authors demonstrate that gradients, which are typically believed to be safe for sharing, can leak private training data. They introduce a method called *Deep Leakage from Gradients* (DLG), which can recover both pixel-wise accurate images and token-wise matching texts from publicly shared gradients. This attack is more powerful than previous methods, which often require additional information or produce synthetic alternatives. DLG works by optimizing dummy inputs and labels to minimize the distance between the dummy gradients and the real gradients. The authors evaluate their method on various datasets and tasks, showing that it can fully recover training data in just a few gradient steps. They also discuss several defense strategies, including gradient perturbation, low precision, and gradient compression, and find that the most effective defense is gradient pruning, which reduces the sparsity of gradients to around 20%. The paper highlights the severe challenges posed by deep leakage to multi-node machine learning systems and calls for a reevaluation of the safety of gradient sharing schemes. The authors aim to raise awareness about the security risks and encourage the development of more robust defenses.The paper "Deep Leakage from Gradients" by Ligeng Zhu, Zhijian Liu, and Song Han from MIT addresses the critical issue of gradient sharing in multi-node machine learning systems, such as distributed training and collaborative learning. The authors demonstrate that gradients, which are typically believed to be safe for sharing, can leak private training data. They introduce a method called *Deep Leakage from Gradients* (DLG), which can recover both pixel-wise accurate images and token-wise matching texts from publicly shared gradients. This attack is more powerful than previous methods, which often require additional information or produce synthetic alternatives. DLG works by optimizing dummy inputs and labels to minimize the distance between the dummy gradients and the real gradients. The authors evaluate their method on various datasets and tasks, showing that it can fully recover training data in just a few gradient steps. They also discuss several defense strategies, including gradient perturbation, low precision, and gradient compression, and find that the most effective defense is gradient pruning, which reduces the sparsity of gradients to around 20%. The paper highlights the severe challenges posed by deep leakage to multi-node machine learning systems and calls for a reevaluation of the safety of gradient sharing schemes. The authors aim to raise awareness about the security risks and encourage the development of more robust defenses.
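The gradient pruning defense described above can be sketched as a simple preprocessing step applied to each gradient tensor before it is shared. The helper name `prune_gradient` and the per-tensor magnitude threshold are illustrative assumptions; the key point is that once a large enough fraction of entries is zeroed out, gradient matching no longer recovers the data.

```python
# Sketch of gradient pruning as a defense: zero out the smallest-magnitude
# entries of each gradient tensor before sharing. The 20% default follows the
# rough threshold reported in the paper; the helper itself is illustrative.
import torch

def prune_gradient(grad: torch.Tensor, prune_ratio: float = 0.2) -> torch.Tensor:
    """Zero out roughly the `prune_ratio` fraction of smallest-magnitude entries."""
    flat = grad.abs().flatten()
    k = int(prune_ratio * flat.numel())
    if k == 0:
        return grad.clone()
    threshold = flat.kthvalue(k).values  # k-th smallest magnitude
    mask = grad.abs() > threshold        # keep only entries above the threshold
    return grad * mask

# Example: prune a fake shared gradient tensor before sending it to the server.
shared = torch.randn(4, 4)
print(prune_gradient(shared, prune_ratio=0.2))
```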