1 Jul 2024 | Avi Schwarzschild, Zhili Feng, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter
This paper addresses memorization in large language models (LLMs) trained on web-scale datasets, which raises concerns about permissible data usage. The authors propose a new metric, the Adversarial Compression Ratio (ACR), to assess memorization: for a given target string, ACR is the ratio of the string's length in tokens to the length of the shortest prompt that elicits it, and a string is considered memorized when that prompt is shorter than the string itself (ACR > 1). This adversarial view overcomes limitations of existing definitions and allows memorization to be measured flexibly for arbitrary strings. The paper presents a detailed algorithm, MINIPROMPT, that estimates ACR by searching for a minimal eliciting prompt, and demonstrates its effectiveness through case studies including in-context unlearning and unlearning of specific content. The results show that even after unlearning, significant portions of the memorized data remain compressible, highlighting the limitations of existing methods. The authors conclude that ACR serves as a practical tool for determining when model owners may be violating data usage terms and provides a critical lens for addressing such scenarios.
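To make the ACR definition concrete, here is a minimal sketch of the computation as described above, assuming a Hugging Face causal LM (`gpt2` is used only as a placeholder). The paper's MINIPROMPT algorithm finds the shortest eliciting prompt with a discrete prompt optimizer; the `candidate_prompts` argument below is a hypothetical stand-in for that search, not the authors' procedure.

```python
# Sketch of the Adversarial Compression Ratio (ACR).
# Assumptions: "gpt2" is a placeholder model; `candidate_prompts` stands in
# for MINIPROMPT's optimized prompt search, which is not reproduced here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()


def elicits(prompt: str, target: str) -> bool:
    """True if greedy decoding from `prompt` reproduces `target` exactly."""
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    target_ids = tok(target, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model.generate(
            prompt_ids,
            max_new_tokens=target_ids.shape[1],
            do_sample=False,  # greedy decoding for the memorization test
            pad_token_id=tok.eos_token_id,
        )
    completion = out[0, prompt_ids.shape[1]:]
    return torch.equal(completion, target_ids[0])


def acr(target: str, candidate_prompts: list[str]) -> float:
    """ACR = |target tokens| / |shortest eliciting prompt in tokens|.

    Returns 0.0 if no candidate elicits the target. A string is deemed
    memorized when ACR > 1, i.e. the prompt compresses the string.
    """
    target_len = tok(target, return_tensors="pt").input_ids.shape[1]
    eliciting_lens = [
        tok(p, return_tensors="pt").input_ids.shape[1]
        for p in candidate_prompts
        if elicits(p, target)
    ]
    return target_len / min(eliciting_lens) if eliciting_lens else 0.0
```

In this framing, the unlearning case studies reduce to re-running the same check after an unlearning procedure: if some short prompt still elicits the target (ACR stays above 1), the string remains compressible and, by this definition, still memorized.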