20 May 2010 | Marco Barreno · Blaine Nelson · Anthony D. Joseph · J.D. Tygar
This paper presents a taxonomy of attacks against machine learning systems and discusses how these attacks influence the costs for both attackers and defenders. The authors analyze the security of machine learning systems in adversarial environments, where attackers can exploit the adaptive nature of these systems to cause errors in classification. They propose a framework for evaluating secure learning, which includes identifying different classes of attacks, analyzing the resilience of existing systems, and investigating potential defenses.
The paper categorizes attacks along three dimensions: influence (causative vs. exploratory), security violation (integrity vs. availability), and specificity (targeted vs. indiscriminate). Combining the two values of each dimension yields eight distinct classes of attacks on machine learning systems. The authors illustrate their taxonomy by analyzing attacks against the SpamBayes spam filter and discuss how their framework can be used to develop more robust secure learning systems.
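The 2 × 2 × 2 structure of the taxonomy can be made concrete with a short sketch (the dimension names are from the paper; the data structure itself is just an illustration):

```python
from itertools import product

# The paper's three attack dimensions. Each attack takes one value
# per dimension, so there are 2 * 2 * 2 = 8 attack classes.
INFLUENCE = ("causative", "exploratory")
VIOLATION = ("integrity", "availability")
SPECIFICITY = ("targeted", "indiscriminate")

attack_classes = [
    {"influence": i, "violation": v, "specificity": s}
    for i, v, s in product(INFLUENCE, VIOLATION, SPECIFICITY)
]

assert len(attack_classes) == 8
for c in attack_classes:
    print(f"{c['influence']} / {c['violation']} / {c['specificity']}")
```

For example, a spam-dictionary poisoning attack would fall in the causative / availability / indiscriminate class, while crafting one spam message to slip past a trained filter is exploratory / integrity / targeted.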
The paper also discusses practical considerations for attackers, such as the need for valid executables, spoofing normal behavior, and the difficulty of generating attack traffic that appears statistically identical to benign traffic; obfuscating spam-indicating words is one such attacker tactic against spam filters. The authors propose various defense strategies, including careful feature selection and cost-sensitive classification. They also stress the importance of ensuring that learners use features causally related to the intrusion itself and the need to balance hypothesis complexity against the capacity to generalize.
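Cost-sensitive classification, one of the defenses mentioned above, can be sketched with a minimal decision rule (the cost values and function names here are illustrative assumptions, not taken from the paper): instead of thresholding the posterior at 0.5, the classifier picks the label with the lowest expected cost.

```python
# Hypothetical cost matrix: COSTS[(true_label, predicted_label)].
# Blocking legitimate mail (false positive) is assumed far more
# costly than letting spam through (false negative).
COSTS = {
    ("ham", "ham"): 0.0,
    ("ham", "spam"): 10.0,  # misclassifying legitimate mail is expensive
    ("spam", "ham"): 1.0,   # letting spam through is only a nuisance
    ("spam", "spam"): 0.0,
}

def cost_sensitive_label(posteriors):
    """Return the label minimizing expected cost.

    posteriors: dict mapping each true label to its estimated probability.
    """
    def expected_cost(predicted):
        return sum(p * COSTS[(true, predicted)]
                   for true, p in posteriors.items())
    return min(("ham", "spam"), key=expected_cost)

# A message that is 70% likely spam is still delivered, because the
# asymmetric costs demand high confidence before blocking:
# expected cost of "ham" = 0.7, of "spam" = 3.0.
print(cost_sensitive_label({"ham": 0.3, "spam": 0.7}))  # -> ham
```

The asymmetry is what makes the rule relevant to secure learning: an attacker who can inflate the filter's uncertainty pushes decisions toward the cheap error, so the defender's cost matrix directly shapes which attacks pay off.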
The paper concludes that the proposed framework provides a common language for thinking and writing about secure learning and offers a foundation for developing highly robust secure learning systems. It emphasizes the importance of understanding the interaction between attackers and defenders in adversarial environments and the need for continuous evaluation and improvement of machine learning systems in such settings.