14 Feb 2019 | Chaowei Xiao, Bo Li, Jun-Yan Zhu, Warren He, Mingyan Liu, and Dawn Song
This paper proposes AdvGAN, a method that uses generative adversarial networks (GANs) to generate adversarial examples. AdvGAN learns to approximate the distribution of original instances, enabling efficient generation of perturbations for any input. It applies in both semi-whitebox and black-box attack settings. In the semi-whitebox setting, once the generator is trained it can produce adversarial perturbations without further access to the target model. In the black-box setting, a distilled model is trained dynamically and used to optimize the generator. AdvGAN achieves high attack success rates against state-of-the-art defenses, including a 92.76% attack success rate on a public MNIST black-box attack challenge.
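The dynamic distillation step above can be sketched as follows. This is a hedged illustration, not the authors' code: the distilled (surrogate) model is fit to the black-box model's outputs on queried inputs by minimizing a cross-entropy objective, and the function names here are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, blackbox_probs):
    # Cross-entropy between the black-box model's output probabilities
    # and the distilled (student) model's predictions on queried inputs.
    # Minimizing this over queries fits the student to the black-box model,
    # giving the generator a differentiable surrogate to attack.
    p = softmax(student_logits)
    return float(-np.sum(blackbox_probs * np.log(p + 1e-12)) / len(p))
```

In AdvGAN's black-box setting this distillation is interleaved with generator updates, so the surrogate tracks the black-box model's behavior on the adversarial examples being generated.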
AdvGAN trains a feed-forward generator network to produce perturbations and a discriminator to ensure the resulting adversarial examples look realistic. This makes generation more efficient than traditional optimization-based methods and yields more perceptually realistic adversarial examples. To attack black-box models, AdvGAN trains a distilled model as a surrogate, achieving high success rates even for targeted black-box attacks.
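The generator objective described above combines three terms: an adversarial loss against the target classifier, a GAN loss from the discriminator, and a hinge loss that softly bounds the perturbation magnitude. A minimal sketch, assuming illustrative weights and a bound c that are not taken from the paper's experiments:

```python
import numpy as np

def hinge_loss(perturbation, c=0.3):
    # Soft L2 bound on the generated perturbation G(x): max(0, ||G(x)||_2 - c).
    return max(0.0, float(np.linalg.norm(perturbation)) - c)

def advgan_objective(l_adv, l_gan, perturbation, alpha=1.0, beta=10.0, c=0.3):
    # Full generator objective: L = L_adv + alpha * L_GAN + beta * L_hinge.
    # l_adv pushes the target model toward the attacker's desired output;
    # l_gan (from the discriminator) keeps x + G(x) perceptually realistic;
    # the hinge term keeps the perturbation small.
    return l_adv + alpha * l_gan + beta * hinge_loss(perturbation, c)
```

In practice each term is an expectation over training data and the whole objective is minimized over the generator while the discriminator is trained adversarially; the scalar form above just shows how the terms combine.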
The paper evaluates AdvGAN in both semi-whitebox and black-box settings and shows that it achieves higher attack success rates than competing methods. The method is also tested on high-resolution images, where it still produces realistic adversarial examples. Because AdvGAN generates high-quality adversarial examples efficiently, it is a promising candidate for strengthening adversarial training defenses.