16 Feb 2018 | Wieland Brendel*, Jonas Rauber* & Matthias Bethge
Decision-based adversarial attacks are a class of attacks that rely solely on the final decision of a machine learning model (e.g. its top-1 class label), which makes them highly relevant for real-world applications where access to model internals is limited. Unlike gradient-based or score-based attacks, decision-based attacks need no model-specific information such as gradients or confidence scores, and they are much more robust to common defenses such as gradient masking and robust training.

The Boundary Attack, introduced in this paper, is a decision-based attack that starts from a large adversarial perturbation and then iteratively reduces it while remaining adversarial. It is conceptually simple, requires little hyperparameter tuning, and is competitive with the best gradient-based attacks on standard computer vision tasks such as ImageNet classification.

Because only the final decision is needed, the attack applies to deployed systems such as autonomous cars and face recognition, whose internals are typically inaccessible. The Boundary Attack was tested on two black-box models from Clarifai.com, demonstrating its practical applicability, and it is also effective against defensive techniques such as defensive distillation, which works by masking gradients. The paper highlights the importance of decision-based attacks for evaluating the robustness of machine learning models and raises concerns about the safety of deployed systems. The Boundary Attack is implemented as part of Foolbox, an open-source library for benchmarking model robustness (a short usage sketch is shown below).
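To make the procedure concrete, here is a minimal, untargeted sketch of the core rejection-sampling loop in Python/NumPy. The `predict` function, the uniform-noise initialization, and the fixed step sizes are illustrative assumptions; the full algorithm described in the paper additionally adapts the orthogonal and source step sizes based on recent acceptance rates.

```python
import numpy as np

def boundary_attack(predict, original, original_label, steps=5000,
                    spherical_step=0.01, source_step=0.01, seed=0):
    """Simplified, untargeted Boundary Attack sketch.

    `predict(x)` returns the model's class label for an image `x` with
    pixel values in [0, 1]; `x` is adversarial if its label differs
    from `original_label`. Step-size adaptation is omitted for brevity.
    """
    rng = np.random.default_rng(seed)

    # 1. Start from a large adversarial perturbation, e.g. uniform noise
    #    that the model misclassifies.
    adversarial = rng.uniform(0.0, 1.0, size=original.shape)
    while predict(adversarial) == original_label:
        adversarial = rng.uniform(0.0, 1.0, size=original.shape)

    for _ in range(steps):
        direction = original - adversarial
        distance = np.linalg.norm(direction)

        # 2. Orthogonal step: draw a Gaussian perturbation, remove its
        #    component towards the original image, and rescale it.
        perturbation = rng.normal(size=original.shape)
        perturbation -= direction * (perturbation * direction).sum() / distance**2
        perturbation *= spherical_step * distance / np.linalg.norm(perturbation)

        # Project the candidate back onto the sphere of radius `distance`
        # around the original image.
        candidate = adversarial + perturbation
        candidate = original + (candidate - original) * (
            distance / np.linalg.norm(candidate - original))

        # 3. Step towards the original image to shrink the perturbation.
        candidate += source_step * (original - candidate)
        candidate = np.clip(candidate, 0.0, 1.0)

        # 4. Accept the candidate only if it is still misclassified.
        if predict(candidate) != original_label:
            adversarial = candidate

    return adversarial
```

Each accepted step keeps the image on the adversarial side of the decision boundary while slowly contracting the perturbation towards the original image, which is why the attack "walks along" the boundary.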
The results show that decision-based attacks can find adversarial perturbations of a magnitude comparable to those found by the best gradient-based attacks, without requiring any gradient or score information, although they typically need substantially more model queries. This makes them a practical tool for attacking real-world, black-box systems. The paper also discusses the role of decision-based attacks in assessing the robustness of machine learning models and highlights the need for further research in this area.
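As a usage illustration, the following sketch runs the BoundaryAttack implementation shipped with Foolbox against a PyTorch classifier. The pretrained ResNet-18, the ImageNet preprocessing constants, and the L2 threshold passed as `epsilons` are assumptions made for this example; the call signature follows the Foolbox 3.x API and may differ in other versions.

```python
import foolbox as fb
import torchvision.models as models

# Illustrative setup: a pretrained ResNet-18 stands in for any classifier;
# the attack only needs the model's decisions, not its gradients.
model = models.resnet18(pretrained=True).eval()
preprocessing = dict(mean=[0.485, 0.456, 0.406],
                     std=[0.229, 0.224, 0.225], axis=-3)
fmodel = fb.PyTorchModel(model, bounds=(0, 1), preprocessing=preprocessing)

# A few sample ImageNet images bundled with Foolbox.
images, labels = fb.utils.samples(fmodel, dataset="imagenet", batchsize=4)

# The Boundary Attack minimises the L2 norm of the perturbation;
# `epsilons` only sets the threshold used to report success.
attack = fb.attacks.BoundaryAttack(steps=25000)
raw, clipped, success = attack(fmodel, images, labels, epsilons=10.0)
print("attack success per image:", success)
```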