18 May 2018 | Pouya Samangouei*, Maya Kabkab*, and Rama Chellappa
Defense-GAN is a novel defense mechanism against adversarial attacks on classification systems that leverages generative models to protect deep neural networks. The method trains a Generative Adversarial Network (GAN) to model the distribution of unperturbed images. At inference time, each input image is projected onto the range of the GAN's generator by searching for a nearby generated image free of adversarial perturbations; this reconstruction is then fed to the classifier. The approach is effective against both black-box and white-box attacks, requires no modification to the classifier's architecture or training procedure, can be used with any classification model, and does not rely on knowledge of the process used to generate the adversarial examples.
Empirical results on benchmark image datasets show that Defense-GAN consistently improves upon existing defense strategies across a range of attack methods, reducing adversarial noise and recovering classification accuracy. The reconstruction error of the projection is also shown to be an effective metric for detecting adversarial examples. The method is implemented in TensorFlow and is publicly available at https://github.com/kabkabm/defensegan.
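The projection step described above can be sketched in a few lines. The toy NumPy example below is an illustrative assumption, not the paper's TensorFlow implementation: it substitutes a random linear map `G(z) = W @ z` for a trained generator, minimizes the squared reconstruction error over the latent code `z` by gradient descent with random restarts (mirroring the paper's L gradient-descent steps and R restarts), and uses a hypothetical threshold on the residual error to flag adversarial inputs.

```python
# Minimal sketch of Defense-GAN's projection step, with a toy linear
# "generator" G(z) = W @ z standing in for a trained GAN generator.
# W, latent_dim, steps, lr, restarts, and the threshold are all
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
img_dim, latent_dim = 16, 4
W = rng.normal(size=(img_dim, latent_dim))  # stand-in generator weights

def G(z):
    """Toy generator mapping a latent code z to an 'image'."""
    return W @ z

def project_onto_generator(x, steps=300, lr=0.01, restarts=5):
    """Approximate min_z ||G(z) - x||^2 by gradient descent with
    random restarts; returns the reconstruction and its error."""
    best_rec, best_err = None, np.inf
    for _ in range(restarts):
        z = rng.normal(size=latent_dim)
        for _ in range(steps):
            grad = 2 * W.T @ (G(z) - x)  # analytic gradient for the toy G
            z -= lr * grad
        err = float(np.sum((G(z) - x) ** 2))
        if err < best_err:
            best_rec, best_err = G(z), err
    return best_rec, best_err

# An image already in the generator's range reconstructs almost exactly...
x_clean = G(rng.normal(size=latent_dim))
_, err_clean = project_onto_generator(x_clean)

# ...while a perturbed image leaves a residual that can flag it as adversarial.
x_adv = x_clean + 0.5 * rng.normal(size=img_dim)
rec_adv, err_adv = project_onto_generator(x_adv)
is_adversarial = err_adv > 1e-3  # hypothetical detection threshold
```

In the full method, `rec_adv` (the clean reconstruction) is what gets passed to the classifier, and because the perturbation largely lies outside the generator's range, the reconstruction error separates clean from adversarial inputs.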