October 2011 | Ling Huang, Anthony D. Joseph, Blaine Nelson, Benjamin I. P. Rubinstein, J. D. Tygar
This paper discusses adversarial machine learning, an emerging field that studies effective machine learning techniques against an adversarial opponent. The paper introduces a taxonomy for classifying attacks against online machine learning algorithms, discusses application-specific factors that limit an adversary's capabilities, presents two models of an adversary's capabilities, and examines the limits of an adversary's knowledge about the algorithm, feature space, and training and input data. It then explores vulnerabilities in machine learning algorithms, discusses countermeasures against attacks, introduces the evasion challenge, and discusses privacy-preserving learning techniques.
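To make the taxonomy concrete, here is a minimal sketch (in Python, with names chosen purely for illustration) of its three axes: the attacker's influence (causative vs. exploratory), the security violation sought (integrity, availability, or privacy), and the attack's specificity (targeted vs. indiscriminate). The two SpamBayes attacks discussed below are placed within it as examples.

```python
from dataclasses import dataclass
from enum import Enum

class Influence(Enum):
    CAUSATIVE = "causative"        # attacker can influence the training data
    EXPLORATORY = "exploratory"    # attacker only probes the deployed learner

class Violation(Enum):
    INTEGRITY = "integrity"        # goal: false negatives (harmful points slip through)
    AVAILABILITY = "availability"  # goal: false positives (benign points are blocked)
    PRIVACY = "privacy"            # goal: infer confidential information from the learner

class Specificity(Enum):
    TARGETED = "targeted"              # aimed at one particular point or a small set
    INDISCRIMINATE = "indiscriminate"  # aimed at a broad class of points

@dataclass(frozen=True)
class Attack:
    name: str
    influence: Influence
    violation: Violation
    specificity: Specificity

# The two SpamBayes attacks discussed below, placed in the taxonomy.
dictionary_attack = Attack("dictionary attack", Influence.CAUSATIVE,
                           Violation.AVAILABILITY, Specificity.INDISCRIMINATE)
focused_attack = Attack("focused attack", Influence.CAUSATIVE,
                        Violation.AVAILABILITY, Specificity.TARGETED)
```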
The paper presents a game-theoretic model of secure learning systems, in which an attacker manipulates data to mis-train or evade a learning algorithm chosen by the defender. The game is formalized in terms of the defender's learning procedure H and the attacker's data-corruption strategies: the defender chooses H to select hypotheses that predict well regardless of the attacker's actions, while the attacker chooses corruption strategies that cause the learned hypotheses to predict poorly. The axes of the taxonomy determine the structure of the game and the legal moves available to each player.
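One compact way to express this interaction (the notation here is illustrative rather than taken verbatim from the paper) is as a minimax problem over expected loss:

$$
\min_{H}\ \max_{A_{\mathrm{train}},\,A_{\mathrm{eval}}}\ \mathbb{E}_{D \sim A_{\mathrm{train}},\ (x,y) \sim A_{\mathrm{eval}}}\big[\, L\big(y,\, f(x)\big) \,\big], \qquad f = H(D),
$$

where $H$ is the defender's learning procedure, $A_{\mathrm{train}}$ and $A_{\mathrm{eval}}$ are the attacker's strategies for influencing the training and evaluation data, $f$ is the learned hypothesis, and $L$ is the loss that the defender wants to keep small and the attacker wants to make large.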
The paper then discusses causative attacks, in which the adversary influences the training data, and presents case studies on SpamBayes and network anomaly detection that show how adversaries can manipulate training data to cause misclassifications. In the SpamBayes case study, it describes two causative availability attacks: an indiscriminate dictionary attack and a targeted focused attack. In the network anomaly detection case study, it describes how adversaries can poison the training data so that the detector fails to detect subsequent DoS attacks.
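As a toy illustration of how a dictionary attack degrades availability, the sketch below trains a deliberately simplified token-based filter (this is not SpamBayes' actual Robinson/chi-squared scoring; the filter, token lists, and numbers are invented for illustration). Attack spam stuffed with ordinary work vocabulary drives those tokens' spam statistics up, so a legitimate message later scores well above a spam threshold.

```python
from collections import Counter

spam_counts, ham_counts = Counter(), Counter()
n_spam = n_ham = 0

def train(tokens, is_spam):
    """Record which tokens appeared in a labeled training message."""
    global n_spam, n_ham
    (spam_counts if is_spam else ham_counts).update(set(tokens))
    if is_spam:
        n_spam += 1
    else:
        n_ham += 1

def spam_score(tokens):
    """Average, over tokens, of a lightly smoothed P(spam | token) estimate."""
    probs = []
    for t in set(tokens):
        s = spam_counts[t] / max(n_spam, 1)
        h = ham_counts[t] / max(n_ham, 1)
        probs.append((s + 0.01) / (s + h + 0.02))
    return sum(probs) / len(probs)

# Clean training data: a few ham messages and one spam message.
train(["meeting", "budget", "review"], is_spam=False)
for words in (["lunch", "friday"], ["travel", "receipt"],
              ["agenda", "notes"], ["invoice", "paid"]):
    train(words, is_spam=False)
train(["viagra", "winner", "prize"], is_spam=True)

print(spam_score(["meeting", "budget", "review"]))  # ~0.05: clearly ham

# Causative availability attack: spam stuffed with ordinary vocabulary is
# reported as spam, inflating those tokens' spam statistics during retraining.
for _ in range(20):
    train(["meeting", "budget", "review", "lunch", "agenda"], is_spam=True)

print(spam_score(["meeting", "budget", "review"]))  # ~0.8: the same ham now looks like spam
```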
The paper also examines the limitations of the adversary's capabilities, including domain restrictions, contrasting feature spaces, and contrasting data distributions, and it models how the attacker's capabilities and knowledge affect the effectiveness of attacks. It further analyzes vulnerabilities of learning algorithms, including the learning and data assumptions that adversaries can exploit. Finally, the paper discusses iterative retraining and how it can be used to defend against adversarial attacks.
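A defense along these lines can be sketched as a retraining loop with a validation gate: a new batch of data is folded into the training set only if retraining on it does not degrade performance on a trusted holdout set. The sketch below is a hypothetical illustration in the spirit of reject-on-negative-impact style defenses, not the paper's algorithm; train_fn, score_fn, and tol are placeholders supplied by the caller.

```python
def retrain_with_gate(train_fn, score_fn, base_data, holdout, new_batches, tol=0.02):
    """Iteratively retrain, rejecting batches that hurt holdout performance."""
    data = list(base_data)
    model = train_fn(data)
    baseline = score_fn(model, holdout)
    for batch in new_batches:
        candidate = train_fn(data + list(batch))
        candidate_score = score_fn(candidate, holdout)
        if candidate_score >= baseline - tol:
            # Accept the batch and keep the retrained model.
            data += list(batch)
            model, baseline = candidate, candidate_score
        # Otherwise the batch is discarded as potentially poisoned.
    return model
```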