20 Nov 2024 | Jing Wu, Trung Le, Munawar Hayat, Mehrtash Harandi
EraseDiff is an algorithm designed to remove undesirable information from diffusion models while preserving their utility. The method formulates the task as a constrained optimization problem using the value function, resulting in a natural first-order algorithm for solving the optimization problem. By altering the generative process to deviate from the ground-truth denoising trajectory, EraseDiff updates parameters for preservation while controlling constraint reduction to ensure effective erasure, striking an optimal trade-off. Extensive experiments and thorough comparisons with state-of-the-art algorithms demonstrate that EraseDiff effectively preserves the model's utility, efficacy, and efficiency.
Diffusion models are highly effective at generating high-quality images but pose risks, such as the unintentional generation of NSFW (not safe for work) content. Although various techniques have been proposed to mitigate unwanted influences in diffusion models while preserving overall performance, achieving a balance between these goals remains challenging. In this work, we introduce EraseDiff, an algorithm designed to preserve the utility of the diffusion model on retained data while removing the unwanted information associated with the data to be forgotten. Our approach formulates this task as a constrained optimization problem using the value function, resulting in a natural first-order algorithm for solving the optimization problem. By altering the generative process to deviate from the ground-truth denoising trajectory, we update parameters for preservation while controlling constraint reduction to ensure effective erasure, striking an optimal trade-off. Extensive experiments and thorough comparisons with state-of-the-art algorithms demonstrate that EraseDiff effectively preserves the model's utility, efficacy, and efficiency.
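The abstract's "natural first-order algorithm" can be illustrated with a toy sketch: minimize a retain loss while simultaneously driving a forget constraint toward its target, using a single combined gradient step. Everything below is a highly simplified stand-in, not the authors' implementation — the quadratic losses, the weight `lam`, and the step size are placeholder assumptions replacing the actual diffusion denoising objectives.

```python
import numpy as np

# Toy sketch of the first-order scheme for the constrained formulation
#   min_theta L_retain(theta)  subject to reducing L_forget(theta).
# The quadratics below are placeholders, NOT real diffusion losses;
# `lam` (constraint weight) and `lr` (step size) are illustrative only.

def l_retain(theta):
    # stand-in for the denoising loss on retained data (optimum at theta = 1)
    return 0.5 * np.sum((theta - 1.0) ** 2)

def l_forget(theta):
    # stand-in for the altered forget objective (optimum at theta = -1)
    return 0.5 * np.sum((theta + 1.0) ** 2)

def grad(f, theta, eps=1e-6):
    # finite-difference gradient so the sketch stays dependency-free
    g = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        g[i] = (f(theta + e) - f(theta - e)) / (2 * eps)
    return g

def erase_step(theta, lam=0.5, lr=0.1):
    # combined first-order update: preserve utility on retained data
    # while pushing the forget constraint down, weighted by lam
    return theta - lr * (grad(l_retain, theta) + lam * grad(l_forget, theta))

theta = np.zeros(3)
for _ in range(200):
    theta = erase_step(theta)
# theta settles between the two optima, at the lam-weighted stationary point
```

With these placeholder quadratics the update converges to the stationary point of `l_retain + lam * l_forget` (here 1/3 for `lam=0.5`), showing how the weight trades preservation against erasure.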
EraseDiff is evaluated on various scenarios, including removing images with specific classes/concepts, to answer the following research questions: (i) Can typical machine unlearning methods be applied to diffusion models? (ii) Is EraseDiff able to remove the influence of Df in the diffusion models? (iii) Is EraseDiff able to preserve the model utility while removing Df? (iv) Is EraseDiff efficient in removing the data? (v) How does EraseDiff perform on publicly released well-trained models?
Results show that EraseDiff outperforms existing methods in terms of both performance and efficiency. It achieves a fine-tuned balance between preservation and targeted erasure, yielding an optimal trade-off. EraseDiff is 11× faster than Heng and Soh's method and 2× faster than Fan's method when forgetting on DDPM while achieving better unlearning results across several metrics. The results demonstrate that EraseDiff is capable of effectively erasing data influence in diffusion models, ranging from specific classes to the concept of nudity.