22 Apr 2024 | Zhangheng Li, Junyuan Hong, Bo Li, Zhangyang Wang
This paper introduces a new privacy risk, Shake-to-Leak (S2L), where fine-tuning pre-trained diffusion models with manipulated data can amplify existing privacy risks. The study reveals that S2L can occur in various standard fine-tuning strategies, including concept-injection methods (DreamBooth, Textual Inversion) and parameter-efficient methods (LoRA, Hypernetwork), as well as their combinations. In the worst case, S2L can increase the AUC of membership inference attacks (MIA) by 5.4% and extract up to 15.8 private samples per target domain. The findings highlight that the privacy risk with diffusion models is more severe than previously recognized.
The S2L process involves generating a synthetic private set (SP Set) using a pre-trained diffusion model, then fine-tuning the model on this set. This process can amplify privacy leakage by making the model more susceptible to membership inference and data extraction attacks. The study demonstrates that S2L can be effective across various fine-tuning methods, including DreamBooth, Textual Inversion, LoRA, and Hypernetwork. The results show that S2L significantly increases the privacy risks associated with diffusion models, particularly when combined with parameter-efficient fine-tuning methods.
The paper also examines how different fine-tuning configurations affect privacy leakage. It finds that excluding the image encoder/decoder from fine-tuning can still increase privacy leakage, and that fine-tuning only the text embeddings is the most parameter-efficient way to amplify it. The number of fine-tuned parameters is closely tied to S2L performance, with fewer parameters tending to yield higher privacy risks. The results also indicate that S2L can be effective even with minimal prior knowledge of the private data distribution, as demonstrated by experiments on smaller models.
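A minimal sketch of that text-embedding-only variant, reusing the hypothetical `pipe` object and training loop from the previous sketch: everything is frozen except the text encoder's token-embedding table (in the spirit of Textual Inversion), and only that table is handed to the optimizer.

```python
# Freeze the UNet, VAE, and text encoder; train only the token-embedding table.
for module in (pipe.unet, pipe.vae, pipe.text_encoder):
    module.requires_grad_(False)

token_embeddings = pipe.text_encoder.get_input_embeddings()  # nn.Embedding
token_embeddings.weight.requires_grad_(True)

optimizer = torch.optim.AdamW([token_embeddings.weight], lr=5e-4)

# Note: text_emb must now be recomputed inside the training loop *with*
# gradients enabled, so that updates to the embedding table take effect.

n_trainable = token_embeddings.weight.numel()
n_total = sum(p.numel() for m in (pipe.unet, pipe.vae, pipe.text_encoder)
              for p in m.parameters())
print(f"trainable parameters: {n_trainable:,} / {n_total:,}")
```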
The paper concludes that S2L is a significant threat to the privacy of diffusion models, as it can amplify existing privacy risks through fine-tuning. The findings underscore the importance of developing robust defense strategies to mitigate the privacy risks associated with diffusion models. The study provides valuable insights into the privacy implications of fine-tuning techniques and highlights the need for further research into effective privacy-preserving methods for diffusion models.