4 Jun 2024 | Matthieu Meeus, Igor Shilov, Manuel Faysse, Yves-Alexandre de Montjoye
The paper explores the use of copyright traps to detect the use of copyrighted materials in Large Language Models (LLMs) through document-level membership inference. Existing methods rely on the model naturally memorizing content, which works for larger models but not for smaller ones. The authors instead propose injecting fictitious entries (traps) into original content to enhance detectability, especially for models that do not naturally memorize. They design a randomized controlled experiment, training a 1.3B-parameter LLM from scratch on data that includes books with injected traps.

The study finds that short and medium-length trap sequences repeated 100 times remain undetectable, while longer sequences repeated 1,000 times can be reliably detected (AUC = 0.75). Sequences with higher perplexity are also more detectable, suggesting that perplexity can act as a confounding factor in post-hoc studies of LLM memorization. The findings contribute both to copyright protection and to the understanding of LLM memorization.
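To make the detection step concrete, below is a minimal sketch of a perplexity-ratio membership inference test in the spirit of the one the paper relies on: compare the trap sequence's perplexity under the suspected (target) model against a reference model. The model paths, the `perplexity` helper, and the trap string are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of a perplexity-ratio membership inference test (illustrative only).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def perplexity(model, tokenizer, text: str) -> float:
    """Perplexity of `text` under `model` (lower = more strongly memorized)."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return torch.exp(loss).item()


# Hypothetical paths: the target model suspected of training on the trap,
# and an independent reference model for calibration.
target = AutoModelForCausalLM.from_pretrained("path/to/target-1.3b").eval()
reference = AutoModelForCausalLM.from_pretrained("path/to/reference-lm").eval()
tokenizer = AutoTokenizer.from_pretrained("path/to/target-1.3b")

trap_sequence = "..."  # placeholder for the fictitious sequence injected into the document

# A score well below 1 means the target assigns the trap unusually low
# perplexity relative to the reference, which is evidence of membership.
score = perplexity(target, tokenizer, trap_sequence) / perplexity(
    reference, tokenizer, trap_sequence
)
print("membership score:", score)
```

Repeating this test over many trap and non-trap sequences and sweeping a decision threshold is what yields an AUC figure like the 0.75 reported above.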