AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models via Watermark LoRA

18 May 2024 | Weitao Feng *1, Wenbo Zhou *1, Jijian He 1, Jie Zhang 1 2, Tianyi Wei 1, Guanlin Li 2, Tianwei Zhang 2, Weiming Zhang 1, Nenghai Yu 1
**Abstract:** Diffusion models, particularly Stable Diffusion (SD), have achieved significant success in generating high-quality images. However, the widespread availability of customized SD models has raised copyright concerns, such as unauthorized distribution and commercial use. To address these issues, recent works have proposed watermarking techniques for post-hoc forensics. However, none of these methods withstands the white-box scenario, in which malicious users with full access to the model weights can easily remove or replace the watermarking module. This paper introduces AquaLoRA, the first implementation of white-box protection for customized SD models. AquaLoRA integrates watermark information into the U-Net of SD models through a two-stage process. In the first stage, a watermark Low-Rank Adaptation (LoRA) module merges watermark information into the U-Net in a flexible manner, without retraining. The second stage introduces Prior Preserving Fine-Tuning (PPFT) to ensure minimal impact on the model's output distribution while learning the watermark. Extensive experiments and ablation studies validate the effectiveness of AquaLoRA, demonstrating its robustness against various distortions and sampling configurations.

**Introduction:** Stable Diffusion models, known for their open-source nature and powerful generative capabilities, have fostered a vibrant community of creators and enthusiasts. However, the ease of sharing and customization has led to copyright concerns. Current watermarking methods, such as image watermarking and integrated watermarking, are often ineffective in white-box scenarios, where adversaries can easily bypass the watermarking process. AquaLoRA addresses this by embedding the watermark directly into the U-Net, so that disrupting the watermark significantly degrades generation fidelity.
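The abstract's claim that the watermark LoRA can be merged into the U-Net "without retraining" rests on a standard property of LoRA: the low-rank update folds algebraically into the base weight. The sketch below illustrates this merge on a single linear layer; the layer sizes, rank, and scaling factor are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, rank = 64, 64, 4      # hypothetical layer sizes and LoRA rank
alpha = 1.0                        # LoRA scaling factor (assumed)

W = rng.normal(size=(d_out, d_in))  # frozen pretrained weight
B = rng.normal(size=(d_out, rank))  # low-rank factors carrying the
A = rng.normal(size=(rank, d_in))   # watermark-specific update

# Merging folds the low-rank update into the base weight, so no separate
# watermarking module remains at inference time -- the property AquaLoRA
# relies on for white-box protection.
W_merged = W + alpha * (B @ A)

x = rng.normal(size=(d_in,))
# The merged layer and the base-plus-adapter layer compute the same output.
assert np.allclose(W_merged @ x, W @ x + alpha * (B @ A @ x))
```

Because the merged weight `W_merged` is a single dense matrix, an adversary cannot strip the watermark by deleting an add-on module; removing it would mean altering the U-Net weights themselves, at the cost of generation fidelity.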
**Preliminaries:** The paper reviews the latent diffusion model and the Low-Rank Adaptation (LoRA) method, which is central to integrating watermark information into the U-Net.

**AquaLoRA:**
- **Overview:** AquaLoRA consists of two stages: latent watermark pre-training and prior preserving fine-tuning. The first stage trains a watermark pattern suitable for the U-Net; the second stage integrates this pattern into the U-Net using LoRA.
- **Latent Watermark Pre-training:** This stage trains a watermark secret encoder and decoder in the latent space, so that the watermark is prominent and cover-agnostic.
- **Watermark Learning with Prior Preserving:** This stage uses Prior Preserving Fine-Tuning (PPFT) to learn the watermark pattern in the U-Net while minimizing changes to the original model's distribution.
- **Coarse Type Adaptation:** This stage fine-tunes AquaLoRA on different coarse types to enhance performance on customized models.

**Experiments:**
- **Fidelity:** AquaLoRA outperforms other methods in image quality and semantic similarity.
- **Robustness:** The method demonstrates strong resilience to various distortions and sampling configurations.
- **Ablation Studies:** Ablation experiments validate the effectiveness of the method's design choices.
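The latent watermark pre-training stage trains a learned secret encoder and decoder; the toy sketch below only illustrates their roles with a fixed random projection in place of the learned networks (a spread-spectrum-style stand-in, not the paper's architecture). The bit length, latent shape, and embedding strength are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

num_bits = 48                    # hypothetical watermark length
latent_shape = (4, 64, 64)       # SD-style latent dimensions (assumed)

# Stand-in for the learned secret encoder/decoder: a fixed random
# projection maps bits to a latent-shaped residual and back.
proj = rng.normal(size=(num_bits, int(np.prod(latent_shape))))

def embed(latent, bits, strength=0.05):
    """Add a cover-agnostic watermark residual to the latent."""
    signs = 2.0 * bits - 1.0                  # {0,1} -> {-1,+1}
    residual = (signs @ proj).reshape(latent_shape)
    return latent + strength * residual

def extract(latent):
    """Recover bits by correlating the latent with each projection row."""
    scores = proj @ latent.reshape(-1)
    return (scores > 0).astype(np.float64)

bits = rng.integers(0, 2, size=num_bits).astype(np.float64)
cover = rng.normal(size=latent_shape)         # stand-in cover latent
watermarked = embed(cover, bits)

recovered = extract(watermarked)
bit_acc = (recovered == bits).mean()          # fraction of bits recovered
```

In AquaLoRA the decoder is what the robustness experiments evaluate: bit accuracy of the extracted secret is measured after distortions and across sampling configurations, which this toy correlation decoder only gestures at.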