ADVERSARIAL EXAMPLES IN THE PHYSICAL WORLD

11 Feb 2017 | Alexey Kurakin, Ian J. Goodfellow, Samy Bengio
This paper investigates the vulnerability of machine learning systems to adversarial examples in physical-world scenarios. Adversarial examples are inputs that are slightly modified to cause a classifier to misclassify them, often in ways that are imperceptible to humans. The study demonstrates that adversarial examples remain a threat even when they are observed through a camera rather than fed directly to the model: the researchers printed adversarial images, photographed them with a cell-phone camera, and found that a large fraction of the resulting photos were still misclassified by an ImageNet Inception classifier.

The paper compares different methods for generating adversarial examples: the fast method, the basic iterative method, and the iterative least-likely-class method. While the fast method is the cheapest to compute, the iterative methods produce more effective adversarial examples when images are fed to the classifier digitally.

The study then examines how various image transformations, such as printing, photographing, and cropping, affect the effectiveness of adversarial examples. It finds that examples generated by the fast method are more robust to these transformations than those produced by the iterative methods.

The research also demonstrates a black-box attack in the physical world, where adversarial examples cause misclassification without access to the underlying model. This was shown using a TensorFlow camera demo app, which misclassified printed adversarial images viewed through the phone's camera. The findings suggest that adversarial examples can be used to attack machine learning systems in physical environments, highlighting the need for robust defenses against such attacks.
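To make the three generation methods concrete, the following is a minimal sketch in TensorFlow, not the authors' released code. It assumes a Keras-style `model` that maps images scaled to [0, 1] into class probabilities; `eps` (perturbation budget), `alpha` (per-step size), and `steps` (iteration count) are placeholder parameter names. The paper itself works with Inception and pixel values in [0, 255], so ranges and scaling here are illustrative assumptions.

```python
import tensorflow as tf

# Assumes: model(x) -> class probabilities, images x in [0, 1].
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)

def fast_method(model, x, y_true, eps):
    """Single-step 'fast' method: move by eps in the sign of the loss gradient."""
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = loss_fn(y_true, model(x))
    grad = tape.gradient(loss, x)
    x_adv = x + eps * tf.sign(grad)
    return tf.clip_by_value(x_adv, 0.0, 1.0)  # keep a valid image

def basic_iterative_method(model, x, y_true, eps, alpha, steps):
    """Iterative variant: repeat small 'fast' steps, staying in an eps-ball around x."""
    x_adv = tf.identity(x)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            tape.watch(x_adv)
            loss = loss_fn(y_true, model(x_adv))
        grad = tape.gradient(loss, x_adv)
        x_adv = x_adv + alpha * tf.sign(grad)
        x_adv = tf.clip_by_value(x_adv, x - eps, x + eps)  # eps-ball constraint
        x_adv = tf.clip_by_value(x_adv, 0.0, 1.0)          # valid pixel range
    return x_adv

def least_likely_class_method(model, x, eps, alpha, steps):
    """Iterative targeted variant: step towards the class the model finds least likely."""
    y_ll = tf.argmin(model(x), axis=-1)  # least-likely class under the clean prediction
    x_adv = tf.identity(x)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            tape.watch(x_adv)
            loss = loss_fn(y_ll, model(x_adv))
        grad = tape.gradient(loss, x_adv)
        x_adv = x_adv - alpha * tf.sign(grad)  # descend the loss towards y_ll
        x_adv = tf.clip_by_value(x_adv, x - eps, x + eps)
        x_adv = tf.clip_by_value(x_adv, 0.0, 1.0)
    return x_adv
```

In this sketch the fast method needs a single gradient computation, while the iterative variants trade extra computation for perturbations that are more effective against the digital classifier, matching the trade-off described above.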