1 Jun 2019 | Cihang Xie, Zhishuai Zhang, Yuyin Zhou, Song Bai, Jianyu Wang, Zhou Ren, Alan Yuille
This paper proposes a method to improve the transferability of adversarial examples by introducing diverse input patterns. Adversarial examples are crafted by adding imperceptible perturbations to clean images, and neural networks are vulnerable to them in both white-box settings, where the attacker has full access to the model, and black-box settings, where the model's structure and parameters are unknown. Existing iterative attacks, however, tend to overfit the network they are crafted on and therefore transfer poorly in the black-box setting. To address this, the authors propose DI²-FGSM (Diverse Inputs Iterative Fast Gradient Sign Method), which applies random transformations to the input image at every iteration, enhancing the transferability of the resulting adversarial examples across different networks.
The method is inspired by data augmentation strategies, which have been shown to prevent overfitting during network training. By applying random transformations such as resizing and padding to the input at each attack iteration, the method optimizes the perturbation over a more diverse set of input patterns, so the adversarial examples depend less on the specific source network and generalize better to unseen architectures. The transformation is integrated with the iterative fast gradient sign method (I-FGSM) and its momentum-based variant (MI-FGSM), further improving the effectiveness of the attack.
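To make the procedure concrete, here is a minimal sketch in PyTorch of the input-diversity idea: a random resize-and-pad transform applied with some probability at each iteration, plugged into a plain iterative FGSM loop. The function names, image sizes, and hyperparameter values are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F

def diverse_input(x, out_size=330, prob=0.5):
    """With probability `prob`, resize the image batch to a random size and
    pad it back to `out_size` x `out_size` at a random offset; otherwise
    return the input unchanged. Sizes here are illustrative."""
    if torch.rand(1).item() >= prob:
        return x
    in_size = x.shape[-1]
    rnd = torch.randint(in_size, out_size, (1,)).item()   # random intermediate size
    resized = F.interpolate(x, size=(rnd, rnd), mode="nearest")
    pad = out_size - rnd
    left = torch.randint(0, pad + 1, (1,)).item()
    top = torch.randint(0, pad + 1, (1,)).item()
    # F.pad takes (left, right, top, bottom) for the last two dimensions.
    return F.pad(resized, (left, pad - left, top, pad - top), value=0.0)

def di2_fgsm(model, x, y, eps=16 / 255, steps=10):
    """Iterative FGSM where each gradient is computed on a randomly
    transformed copy of the current adversarial image (DI2-FGSM).
    Assumes `model` accepts variable input sizes (e.g. via adaptive pooling)."""
    alpha = eps / steps
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(diverse_input(x_adv)), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)   # project back into the eps-ball
        x_adv = x_adv.clamp(0.0, 1.0).detach()     # keep pixel values valid
    return x_adv
```

The momentum-based variant mentioned above would additionally accumulate the normalized gradients across iterations before taking the sign; the diversity transform itself stays unchanged.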
Experiments on ImageNet show that DI²-FGSM significantly outperforms existing baselines in black-box settings while maintaining high success rates in white-box settings. When further combined with momentum and ensemble attacks and evaluated against the top defense solutions and official baselines from the NIPS 2017 adversarial competition, the enhanced attack reaches an average success rate of 73.0%, 6.6% higher than the top-1 attack submission, demonstrating that the generated adversarial examples transfer well across different models.
The paper also explores combining diverse input patterns with ensemble attacks, in which adversarial examples are generated against multiple networks simultaneously. An example that remains adversarial for several source networks is less likely to have overfit any single one of them and is therefore more likely to fool an unseen model, further enhancing transferability. The results show that this combination significantly outperforms other attacks in both white-box and black-box settings.
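As a rough illustration of how input diversity and ensembling compose, the sketch below fuses the logits of several source models into a single loss, so each attack step must increase the loss of all of them at once. The equal weighting and the way it slots into the earlier loop are assumptions made for the example, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def ensemble_loss(models, x, y, weights=None):
    """Fuse the logits of all source models (equal weights by default)
    and compute a single cross-entropy loss on the fused prediction."""
    logits = torch.stack([m(x) for m in models])        # (num_models, batch, classes)
    if weights is None:
        weights = torch.full((len(models),), 1.0 / len(models))
    fused = (weights.view(-1, 1, 1) * logits).sum(dim=0)
    return F.cross_entropy(fused, y)

# Inside the attack loop from the earlier sketch, the single-model loss
# would simply be replaced by the ensemble loss on the transformed input:
#   loss = ensemble_loss(models, diverse_input(x_adv), y)
#   grad = torch.autograd.grad(loss, x_adv)[0]
```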
The study highlights the importance of input diversity for improving the transferability of adversarial examples. Diverse input patterns yield more robust and transferable adversarial examples, which in turn provide a stronger benchmark for evaluating the robustness of neural networks and the effectiveness of defense mechanisms.