Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks

30 May 2018 | Kang Liu, Brendan Dolan-Gavitt, and Siddharth Garg
This paper presents the first effective defense against backdoor attacks on deep neural networks (DNNs). The authors implement three backdoor attacks from prior work and evaluate two promising defenses: pruning and fine-tuning. They find that neither defense alone is sufficient against a sophisticated attacker, and so introduce a combined defense, fine-pruning, that applies pruning followed by fine-tuning. Fine-pruning successfully weakens or eliminates backdoors, reducing attack success rates to 0% in some cases with only a small drop in accuracy on clean inputs. The paper also introduces a new pruning-aware backdoor attack that evades the pruning defense by concentrating clean and backdoor behavior on the same set of neurons. The authors evaluate fine-pruning on three backdoored networks (face, speech, and traffic sign recognition) and find it effective at disabling backdoors in all three. They further discuss the threat model, the implemented backdoor attacks, and the individual effectiveness of pruning and fine-tuning, concluding that fine-pruning is a promising first step toward safe outsourced training for DNNs.
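The pruning half of the defense exploits the observation that backdoor behavior tends to live in neurons that stay dormant on clean inputs. Below is a minimal numpy sketch of that idea; the function name and the toy data are our own illustration, not the paper's code, and the actual defense operates on the average activations of the final convolutional layer and then fine-tunes the pruned network on clean data.

```python
import numpy as np

def prune_dormant_neurons(activations, frac=0.3):
    """Return a binary mask zeroing the fraction of neurons with the
    lowest mean activation over a clean validation set.

    activations: (n_samples, n_neurons) array recorded on clean inputs.
    """
    mean_act = activations.mean(axis=0)
    n_prune = int(frac * mean_act.size)
    mask = np.ones_like(mean_act)
    # Indices of the n_prune least-active ("dormant") neurons --
    # the ones most likely to carry backdoor behavior.
    dormant = np.argsort(mean_act)[:n_prune]
    mask[dormant] = 0.0
    return mask

# Toy example: 100 clean samples, 10 neurons; neurons 0-2 barely fire
# on clean inputs, mimicking backdoor-only neurons.
rng = np.random.default_rng(0)
acts = rng.random((100, 10))
acts[:, :3] *= 0.01
mask = prune_dormant_neurons(acts, frac=0.3)
print(mask)  # zeros at indices 0-2, ones elsewhere
```

In the full defense, this mask would be applied to the layer's outputs and the network then fine-tuned on clean data, which is what recovers accuracy and defeats the pruning-aware attack.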