7 Feb 2017 | Yanpei Liu*, Xinyun Chen*, Chang Liu, Dawn Song
This paper investigates the transferability of adversarial examples across different deep neural network architectures and a large-scale dataset. The authors first demonstrate that adversarial examples can transfer between models, which poses a significant threat to deep-learning-based applications: an attacker can craft adversarial inputs on a surrogate model and use them against a deployed black-box target. They conduct an extensive study of transferability over large models and a large-scale dataset, and they are the first to study the transferability of targeted adversarial examples together with their target labels.

The study shows that while non-targeted adversarial examples are easy to find and transfer well, targeted adversarial examples generated with existing single-model approaches rarely transfer with their target labels. To address this, the authors propose novel ensemble-based approaches that optimize an adversarial example against an ensemble of models rather than a single one. Using these approaches, they observe, for the first time, a large proportion of targeted adversarial examples transferring with their target labels.

They also present geometric studies to better understand transferable adversarial examples, finding that the gradient directions of different models are nearly orthogonal to each other while their decision boundaries align well, which partially explains why adversarial examples transfer. Finally, they show that adversarial examples generated with the ensemble-based approaches can successfully attack Clarifai.com, a black-box image classification system. The study highlights the importance of understanding the geometric properties of deep neural networks in order to understand and mitigate the risks posed by adversarial examples.
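To make the ensemble idea concrete, here is a minimal PyTorch sketch of a targeted attack in the spirit of the paper's formulation: the perturbation is optimized so that a weighted average of the models' softmax outputs assigns high probability to the target label. The function name, weights, and hyperparameters are illustrative assumptions, not the authors' exact setup.

```python
import torch
import torch.nn.functional as F

def ensemble_targeted_attack(models, weights, x, target,
                             steps=100, lr=0.01, eps=16 / 255):
    """Sketch of an ensemble-based targeted attack (hedged; hyperparameters
    and the L_inf projection are illustrative choices)."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        x_adv = (x + delta).clamp(0, 1)
        # Weighted average of per-model softmax predictions.
        probs = sum(w * F.softmax(m(x_adv), dim=1)
                    for m, w in zip(models, weights))
        # Maximize the ensemble probability of the target class.
        loss = -torch.log(probs[:, target] + 1e-12).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Keep the perturbation inside an L_inf ball of radius eps.
        with torch.no_grad():
            delta.clamp_(-eps, eps)
    return (x + delta).detach().clamp(0, 1)
```

The intuition behind attacking the ensemble is that an example fooling several diverse models at once is more likely to lie in a region shared with an unseen black-box model's decision boundary, which is what the paper's geometric analysis suggests.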
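The near-orthogonality finding is also easy to probe empirically. The sketch below, under the assumption that both models share the same input and label format, measures the cosine similarity between the two models' loss gradients at the same input; values near zero reflect the paper's observation.

```python
import torch
import torch.nn.functional as F

def grad_cosine(model_a, model_b, x, y):
    """Cosine similarity between the loss-gradient directions of two models
    at the same input (illustrative sketch; model names are placeholders)."""
    grads = []
    for model in (model_a, model_b):
        x_in = x.clone().requires_grad_(True)
        loss = F.cross_entropy(model(x_in), y)
        # Gradient of the loss with respect to the input image.
        g, = torch.autograd.grad(loss, x_in)
        grads.append(g.flatten())
    return F.cosine_similarity(grads[0], grads[1], dim=0).item()
```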