This paper introduces a method to detect, explain, and mitigate memorization in diffusion models. Diffusion models have shown impressive image generation capabilities, but some outputs are merely copies of training data, raising legal and privacy concerns. The authors propose a detection method based on the magnitude of text-conditional predictions, which effectively identifies memorized prompts without disrupting sampling algorithms. This method achieves high accuracy even at the first generation step, with a single generation per prompt.
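The detection signal described above, the magnitude of the difference between text-conditional and unconditional noise predictions, can be sketched as follows. This is a minimal illustration, not the authors' code: `toy_eps_model` and all inputs are hypothetical stand-ins for a real denoiser and prompt embeddings.

```python
import numpy as np

def detection_metric(eps_model, x_t, text_emb, null_emb):
    """Magnitude of the text-conditional prediction difference.

    Memorized prompts tend to produce an unusually large gap between
    the prompt-conditioned and unconditioned predictions, so a large
    value flags likely memorization (even at the first sampling step).
    """
    eps_cond = eps_model(x_t, text_emb)    # prediction conditioned on the prompt
    eps_uncond = eps_model(x_t, null_emb)  # prediction with an empty prompt
    return float(np.linalg.norm(eps_cond - eps_uncond))

# Toy stand-in for a denoiser: the conditioning embedding simply shifts
# the prediction, so a "stronger" prompt yields a larger metric.
def toy_eps_model(x_t, emb):
    return x_t * 0.1 + emb.mean()

x_t = np.ones(8)                 # stand-in for a noised latent
memorized = np.full(4, 5.0)      # stand-in embedding of a memorized prompt
ordinary = np.full(4, 0.1)       # stand-in embedding of an ordinary prompt
null = np.zeros(4)               # stand-in empty-prompt embedding

assert detection_metric(toy_eps_model, x_t, memorized, null) > \
       detection_metric(toy_eps_model, x_t, ordinary, null)
```

Because the metric needs only the model's own predictions, it can be computed during a normal sampling run without altering the sampler, which is what makes single-generation, first-step detection possible.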
The authors also develop an explainable approach to identify the contribution of individual words or tokens to memorization, offering users an interactive way to adjust their prompts. Two mitigation strategies are proposed: one during inference by minimizing text-conditional prediction magnitudes, and another during training by filtering out memorized image-text pairs. These strategies effectively counteract memorization while maintaining high generation quality.
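The inference-time strategy, minimizing the text-conditional prediction magnitude, can be sketched as a small optimization over the prompt embedding. This is a simplified illustration under stated assumptions: `toy_denoiser` is a hypothetical stand-in, and a finite-difference gradient replaces the backpropagation a real implementation would use.

```python
import numpy as np

def mitigate_prompt_embedding(eps_model, x_t, text_emb, null_emb,
                              lr=0.5, steps=50, h=1e-4):
    """Nudge the prompt embedding to shrink the text-conditional
    prediction magnitude, approximating the paper's inference-time
    mitigation with a numerical gradient (a sketch, not the authors' code)."""
    emb = text_emb.astype(float).copy()

    def magnitude(e):
        return np.linalg.norm(eps_model(x_t, e) - eps_model(x_t, null_emb))

    for _ in range(steps):
        grad = np.zeros_like(emb)
        for i in range(emb.size):          # central finite differences
            bump = np.zeros_like(emb)
            bump[i] = h
            grad[i] = (magnitude(emb + bump) - magnitude(emb - bump)) / (2 * h)
        emb -= lr * grad                   # gradient step toward a smaller magnitude
    return emb

# Toy stand-in for a denoiser, as in the detection sketch.
def toy_denoiser(x_t, emb):
    return x_t * 0.1 + emb.mean()

x_t = np.ones(8)
emb0 = np.full(4, 5.0)   # stand-in embedding of a memorized prompt
null = np.zeros(4)

emb1 = mitigate_prompt_embedding(toy_denoiser, x_t, emb0, null)
before = np.linalg.norm(toy_denoiser(x_t, emb0) - toy_denoiser(x_t, null))
after = np.linalg.norm(toy_denoiser(x_t, emb1) - toy_denoiser(x_t, null))
assert after < before
```

The adjusted embedding then replaces the original one during sampling, steering generation away from the memorized output while leaving the rest of the pipeline untouched.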
The paper also situates the work within related research, including membership inference attacks, training data extraction, and prior diffusion memorization mitigation. The authors evaluate their detection method on a dataset of 500 memorized prompts, achieving high precision at low computational cost, and their experiments confirm that the mitigation strategies reduce memorization with little loss in generation quality.
The paper concludes that their approach provides a practical solution to the problem of memorization in diffusion models, offering both detection and mitigation strategies that are efficient and effective. The code for the proposed methods is available at https://github.com/YuxinWenRick/diffusion_memorization.