CPR: Retrieval Augmented Generation for Copyright Protection

27 Mar 2024 | Aditya Golatkar, Alessandro Achille, Luca Zancato, Yu-Xiang Wang, Ashwin Swaminathan, Stefano Soatto
The paper introduces Copy-Protected Generation with Retrieval (CPR), a method for Retrieval-Augmented Generation (RAG) that ensures strong copyright protection in a mixed-private setting for diffusion models. CPR allows conditioning the output of diffusion models on a set of retrieved images while preventing the exposure of unique, identifiable information about those images. The method combines public and private distributions by merging their diffusion scores at inference, ensuring that the generated outputs do not contain private data. The authors prove that CPR satisfies Near Access Freeness (NAF), a relaxation of differential privacy, which bounds the amount of information an attacker can extract from the generated images. Two algorithms, CPR-KL and CPR-Choose, are provided for efficient copyright-protected sampling without the need for rejection sampling. Empirical results show that CPR improves text-to-image alignment and enhances the quality of generated images while maintaining privacy guarantees.
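The idea of merging diffusion scores at inference can be illustrated with a toy sketch. This is not the authors' CPR-KL or CPR-Choose algorithm: it only shows a convex combination of two score functions, using one-dimensional Gaussians as stand-ins for the public and retrieval-conditioned models. All names here (`gaussian_score`, `mixed_score`, `weight`) are hypothetical and chosen for illustration.

```python
def gaussian_score(x, mean, var=1.0):
    """Score (derivative of the log-density) of a 1-D Gaussian N(mean, var)."""
    return -(x - mean) / var


def mixed_score(x, score_public, score_retrieval, weight):
    """Convex combination of two score functions.

    `weight` is the share given to the retrieval-conditioned score;
    weight = 0 recovers the public score alone. (Assumed parametrization,
    not taken from the paper.)
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - weight) * score_public(x) + weight * score_retrieval(x)


# Toy stand-ins: a "public" model centered at 0 and a
# "retrieval-conditioned" model centered at 4, both with unit variance.
score_public = lambda x: gaussian_score(x, mean=0.0)
score_retrieval = lambda x: gaussian_score(x, mean=4.0)

# With weight 0.25 the merged score is zero at the weighted mean
# 0.75 * 0 + 0.25 * 4 = 1, so score-following dynamics settle there.
print(mixed_score(1.0, score_public, score_retrieval, weight=0.25))
```

In this toy case the merged score is the score of a Gaussian whose mean lies between the two models, so a small `weight` keeps samples close to the public distribution while still pulling them toward the retrieved content.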