15 Dec 2017 | Xinyun Chen, Chang Liu, Bo Li, Kimberly Lu, Dawn Song
This paper introduces a new type of attack on deep learning systems called *backdoor attacks*, focusing specifically on *backdoor poisoning attacks*. These attacks plant a backdoor in a learning-based authentication system so that the attacker can later bypass it by presenting a specific backdoor key instance. The authors study backdoor poisoning attacks under a weak threat model in which the attacker has no knowledge of the model or the training set and can only inject a small number of poisoning samples. They propose two classes of poisoning strategies: *input-instance-key attacks* and *pattern-key attacks*. The former creates backdoor instances similar to a single input instance, while the latter creates backdoor instances that share a common pattern. Experiments on state-of-the-art face recognition models demonstrate that only a few poisoning samples (e.g., 5 for input-instance-key attacks, 50 for pattern-key attacks) are sufficient to achieve high attack success rates (over 90%) while maintaining high test accuracy on pristine data. The authors also show that the proposed attacks can produce physically implementable backdoors, underscoring the need for defenses against such attacks.
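To make the pattern-key idea concrete, here is a minimal sketch (not the authors' code) of how a blended pattern-key poisoning sample might be constructed: a fixed key pattern is blended into a clean image with a small weight, and the result is relabeled as the attacker's target identity. The blend ratio `alpha`, the `pattern` image, and the `target_label` are illustrative assumptions for this example, not values taken from the paper.

```python
import numpy as np

def make_pattern_key_poison(clean_image: np.ndarray,
                            pattern: np.ndarray,
                            target_label: int,
                            alpha: float = 0.2):
    """Blend a fixed key pattern into a clean image and relabel it.

    Illustrative sketch of a blended pattern-key poisoning sample;
    `alpha`, `pattern`, and `target_label` are assumptions for the
    example, not values from the paper.
    """
    # Convex combination of the clean image and the key pattern.
    poisoned = (1.0 - alpha) * clean_image + alpha * pattern
    poisoned = np.clip(poisoned, 0.0, 255.0).astype(clean_image.dtype)
    # The poisoned sample carries the attacker's target label, so a model
    # trained on it learns to associate the pattern with that identity.
    return poisoned, target_label


# Example: generate one poisoned sample from a (synthetic) clean image.
rng = np.random.default_rng(0)
pattern = rng.integers(0, 256, size=(224, 224, 3)).astype(np.float32)
clean = rng.integers(0, 256, size=(224, 224, 3)).astype(np.float32)
poisoned_image, poisoned_label = make_pattern_key_poison(clean, pattern,
                                                         target_label=7)
```

In the threat model above, only a handful of such poisoned image/label pairs would be injected into the victim's training data; at test time, presenting any input blended with the same pattern would trigger the target classification.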