Corrective Machine Unlearning


21 Feb 2024 | Shashwat Goel*¹, Ameya Prabhu*²,³, Philip Torr², Ponnurangam Kumaraguru¹, and Amartya Sanyal⁴
The paper "Corrective Machine Unlearning" addresses the challenge of mitigating the adverse effects of manipulated data on machine learning models, particularly in the context of large-scale training datasets sourced from the internet. The authors formalize the concept of "Corrective Machine Unlearning" as a problem where the goal is to improve model accuracy on affected domains even with limited knowledge of the manipulated data. They highlight that traditional unlearning methods, which aim to achieve retrain indistinguishability, are not suitable for this context due to the different requirements of corrective unlearning. The paper evaluates several state-of-the-art unlearning methods, including Selective Synaptic Dampening (SSD), and finds that while SSD can effectively remove the effects of poisoning attacks with a small subset of identified samples, it fails in the Interclass Confusion setting, demonstrating the need for more robust corrective unlearning methods. The authors conclude by emphasizing the importance of developing better corrective unlearning methods and evaluating them across different manipulation types to address the challenges of data integrity in web-scale training.The paper "Corrective Machine Unlearning" addresses the challenge of mitigating the adverse effects of manipulated data on machine learning models, particularly in the context of large-scale training datasets sourced from the internet. The authors formalize the concept of "Corrective Machine Unlearning" as a problem where the goal is to improve model accuracy on affected domains even with limited knowledge of the manipulated data. They highlight that traditional unlearning methods, which aim to achieve retrain indistinguishability, are not suitable for this context due to the different requirements of corrective unlearning. The paper evaluates several state-of-the-art unlearning methods, including Selective Synaptic Dampening (SSD), and finds that while SSD can effectively remove the effects of poisoning attacks with a small subset of identified samples, it fails in the Interclass Confusion setting, demonstrating the need for more robust corrective unlearning methods. The authors conclude by emphasizing the importance of developing better corrective unlearning methods and evaluating them across different manipulation types to address the challenges of data integrity in web-scale training.
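Since SSD is the method the paper singles out, a rough PyTorch sketch of its core idea may help: estimate a diagonal-Fisher importance of each parameter on the forget set and on the full training set, then dampen parameters that are disproportionately important to the forget set. This follows our reading of the published SSD description; the hyperparameter names `alpha` and `lam` and the exact masking rule are assumptions, so treat this as an illustration and consult the official implementation for details.

```python
import torch

def fisher_diag(model, loader, loss_fn, device="cpu"):
    """Diagonal Fisher approximation: mean squared gradient per parameter."""
    importances = [torch.zeros_like(p) for p in model.parameters()]
    n_batches = 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for imp, p in zip(importances, model.parameters()):
            if p.grad is not None:
                imp += p.grad.detach() ** 2
        n_batches += 1
    return [imp / max(n_batches, 1) for imp in importances]

@torch.no_grad()
def ssd_dampen(model, imp_full, imp_forget, alpha=10.0, lam=1.0):
    """Selective Synaptic Dampening sketch: shrink weights that are far
    more important to the forget set than to the full training set."""
    for p, i_d, i_f in zip(model.parameters(), imp_full, imp_forget):
        mask = i_f > alpha * i_d                       # specialised to forget set
        beta = torch.clamp(lam * i_d / (i_f + 1e-12), max=1.0)
        p[mask] *= beta[mask]                          # dampen, never amplify
```

One appeal of this approach, consistent with the paper's findings on poisoning, is that it is retraining-free and acts only on parameters tied to the identified samples, which is plausibly why it can succeed with a small deletion set where retraining-based notions cannot.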