BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain

11 Mar 2019 | Tianyu Gu, Brendan Dolan-Gavitt, Siddharth Garg
The paper "BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain" by Tianyu Gu explores the security risks associated with outsourced training of machine learning models. The authors demonstrate that maliciously trained neural networks, known as BadNets, can be created to perform well on standard inputs but misbehave on specific inputs chosen by the attacker. These BadNets are stealthy and can pass standard validation tests, making them difficult to detect. The paper presents two main attack scenarios: outsourced training and transfer learning. In the outsourced training scenario, an attacker can create a BadNet by poisoning the training dataset. In the transfer learning scenario, a BadNet trained on one dataset can be used to adapt to a new dataset, causing a significant drop in accuracy when encountering backdoor triggers. The authors demonstrate these attacks using a handwritten digit classifier and a traffic sign detector, showing that the BadNet can misclassify stop signs as speed limits when a specific sticker is added. They also evaluate the Caffe Model Zoo, a popular source of pre-trained models, and find vulnerabilities that could allow attackers to introduce backdoors. The paper concludes with recommendations for securing pre-trained models and emphasizes the need for better techniques to detect and prevent such attacks.The paper "BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain" by Tianyu Gu explores the security risks associated with outsourced training of machine learning models. The authors demonstrate that maliciously trained neural networks, known as BadNets, can be created to perform well on standard inputs but misbehave on specific inputs chosen by the attacker. These BadNets are stealthy and can pass standard validation tests, making them difficult to detect. The paper presents two main attack scenarios: outsourced training and transfer learning. In the outsourced training scenario, an attacker can create a BadNet by poisoning the training dataset. In the transfer learning scenario, a BadNet trained on one dataset can be used to adapt to a new dataset, causing a significant drop in accuracy when encountering backdoor triggers. The authors demonstrate these attacks using a handwritten digit classifier and a traffic sign detector, showing that the BadNet can misclassify stop signs as speed limits when a specific sticker is added. They also evaluate the Caffe Model Zoo, a popular source of pre-trained models, and find vulnerabilities that could allow attackers to introduce backdoors. The paper concludes with recommendations for securing pre-trained models and emphasizes the need for better techniques to detect and prevent such attacks.