11 Feb 2017 | Alexey Kurakin, Ian J. Goodfellow, Samy Bengio
This paper applies adversarial training to ImageNet, demonstrating how to scale it to large models and datasets. The authors show that adversarial training confers robustness to single-step attack methods but is far less effective against multi-step (iterative) attacks. They also observe a "label leaking" effect: when adversarial examples are constructed using the true label (as in one-step FGSM), adversarially trained models can perform better on adversarial examples than on clean ones, because the construction leaks information about the label. Comparing several generation methods, they find that one-step targeted attacks that avoid the true label, "step l.l." (targeting the least-likely predicted class) and "step rnd." (targeting a random class), are the most effective for adversarial training, and they recommend the "step l.l." method.

Adversarial training with "step l.l." substantially improves robustness to one-step adversarial examples, at the cost of a slight drop in accuracy on clean examples, and models with higher capacity are more robust to adversarial examples. The study also shows that adversarial examples generated by one-step methods transfer between models more readily than those generated by iterative methods, which is relevant to black-box attacks. The authors conclude that adversarial training is an effective defense against single-step adversarial examples, though it may slightly reduce clean accuracy.
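The sketch below illustrates the one-step methods discussed above and how they plug into a training loop. It is a minimal PyTorch illustration, not the authors' code (the paper used Inception models trained on ImageNet); `model`, `optimizer`, `eps`, and `adv_fraction` are assumed placeholders, and pixel values are assumed to lie in [0, 1].

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """One-step FGSM using the true label y (the variant that can cause
    label leaking when used for adversarial training)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    # Step in the direction that increases the loss on the true label.
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()

def step_ll(model, x, eps):
    """One-step "step l.l." method: step towards the least-likely
    predicted class, so the true label is never used."""
    x_adv = x.clone().detach().requires_grad_(True)
    logits = model(x_adv)
    y_ll = logits.argmin(dim=1)          # least-likely class per example
    loss = F.cross_entropy(logits, y_ll)
    grad = torch.autograd.grad(loss, x_adv)[0]
    # Descend the loss w.r.t. the least-likely class (note the minus sign).
    return (x_adv - eps * grad.sign()).clamp(0, 1).detach()

def adversarial_training_step(model, optimizer, x, y, eps, adv_fraction=0.5):
    """Replace a fraction of the minibatch with step-l.l. adversarial
    examples, then take an ordinary optimizer step on the mixed batch."""
    k = int(adv_fraction * x.size(0))
    x_mixed = x.clone()
    x_mixed[:k] = step_ll(model, x[:k], eps)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_mixed), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because `step_ll` derives its target from the model's own predictions rather than the ground-truth label, it avoids the label-leaking effect that makes FGSM-trained models look spuriously strong on adversarial inputs.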