23 Feb 2024 | Shenglai Zeng, Jiankun Zhang, Pengfei He, Yue Xing, Yiding Liu, Han Xu, Jie Ren, Shuaqiang Wang, Dawei Yin, Yi Chang
This paper investigates privacy risks in retrieval-augmented generation (RAG) systems, which integrate external data with large language models (LLMs) to enhance text generation. RAG systems can potentially expose private information from both the retrieval database and the LLM's training data. The authors conduct extensive empirical studies using novel attack methods to demonstrate the vulnerability of RAG systems to privacy leaks. They find that RAG systems can extract sensitive information from retrieval databases, such as medical records or personal data, and also reduce the leakage of LLM training data.
The study addresses two research questions: (RQ1) Can private data be extracted from the retrieval database? (RQ2) Can retrieval data affect the memorization of LLMs? For RQ1, the authors propose a composite structured prompting attack that combines information retrieval and command prompts to extract private data. They find that RAG systems are highly susceptible to such attacks, with high success rates in extracting sensitive information. For RQ2, the authors show that incorporating retrieval data into RAG systems can reduce the LLM's tendency to output memorized training data, providing greater protection than other methods like noise injection or system prompts.
The study also explores potential mitigation strategies, such as re-ranking and summarization, to reduce privacy risks. However, the results show that these methods have limited effectiveness in reducing privacy leaks, especially in targeted attacks. The authors conclude that RAG systems can significantly reduce the risk of privacy leaks from LLM training data, making them a safer architecture compared to using LLMs alone. The findings highlight the importance of addressing privacy concerns in RAG systems to ensure the secure use of external data in LLMs.This paper investigates privacy risks in retrieval-augmented generation (RAG) systems, which integrate external data with large language models (LLMs) to enhance text generation. RAG systems can potentially expose private information from both the retrieval database and the LLM's training data. The authors conduct extensive empirical studies using novel attack methods to demonstrate the vulnerability of RAG systems to privacy leaks. They find that RAG systems can extract sensitive information from retrieval databases, such as medical records or personal data, and also reduce the leakage of LLM training data.
The study addresses two research questions: (RQ1) Can private data be extracted from the retrieval database? (RQ2) Can retrieval data affect the memorization of LLMs? For RQ1, the authors propose a composite structured prompting attack that combines information retrieval and command prompts to extract private data. They find that RAG systems are highly susceptible to such attacks, with high success rates in extracting sensitive information. For RQ2, the authors show that incorporating retrieval data into RAG systems can reduce the LLM's tendency to output memorized training data, providing greater protection than other methods like noise injection or system prompts.
The study also explores potential mitigation strategies, such as re-ranking and summarization, to reduce privacy risks. However, the results show that these methods have limited effectiveness in reducing privacy leaks, especially in targeted attacks. The authors conclude that RAG systems can significantly reduce the risk of privacy leaks from LLM training data, making them a safer architecture compared to using LLMs alone. The findings highlight the importance of addressing privacy concerns in RAG systems to ensure the secure use of external data in LLMs.