Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models


19 Apr 2024 | Peifei Zhu, Tsubasa Takahashi, Hirokatsu Kataoka
This paper proposes a method for copyright protection against diffusion models (DMs) that embeds personal watermarks into adversarial examples. The adversarial examples force DMs to generate images carrying a visible watermark, thereby preventing unauthorized imitation. A generator based on conditional adversarial networks is trained with three losses: an adversarial loss, a GAN loss, and a perturbation loss. Together these losses let the generator produce adversarial examples whose subtle perturbations effectively attack DMs while remaining invisible to the human eye. The generator can be trained with as few as 5-10 samples in 2-3 minutes and produces adversarial examples at roughly 0.2 seconds per image.

The method is evaluated in several image generation scenarios and shows good transferability across different generative models. Unlike existing methods, it adds visible watermarks to the generated images, giving a more direct way to indicate copyright violations. It is also robust against adversarial defenses and performs well in both text-guided image-to-image generation and textual inversion scenarios. The results show that the proposed method effectively prevents unauthorized image generation by DMs and provides a simple yet powerful way to protect image copyright.
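To make the three-loss training objective concrete, the sketch below shows one plausible way to combine an adversarial loss, a GAN loss, and a perturbation loss when optimizing a perturbation generator. This is a minimal illustration, not the paper's implementation: the network architectures, the frozen `surrogate` encoder standing in for the attacked diffusion model, the perturbation budget, and the loss weights are all assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerturbationGenerator(nn.Module):
    """Hypothetical small generator that outputs a bounded perturbation."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x, budget=8 / 255):
        # Keep the perturbation within an L-infinity budget so it stays invisible.
        return x + budget * self.net(x)

class Discriminator(nn.Module):
    """Hypothetical patch discriminator used for the GAN loss."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.net(x)

def generator_loss(gen, disc, surrogate, images, watermark_target):
    """One combined objective over the three losses described above (weights are placeholders)."""
    adv_images = gen(images)

    # (1) Adversarial loss: pull the surrogate model's features of the perturbed
    # image toward features of the watermarked target, so a DM conditioned on
    # the image tends to reproduce the visible watermark.
    adv_loss = F.mse_loss(surrogate(adv_images), surrogate(watermark_target))

    # (2) GAN loss: the discriminator should judge adversarial examples as real,
    # keeping them visually close to natural images.
    logits = disc(adv_images)
    gan_loss = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))

    # (3) Perturbation loss: penalize the magnitude of the added perturbation.
    pert_loss = F.mse_loss(adv_images, images)

    return adv_loss + 0.1 * gan_loss + 10.0 * pert_loss
```

Because the generator is trained once and then applied feed-forward, producing a protected image at inference time is a single forward pass, which is consistent with the reported speed of about 0.2 seconds per image.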