Be like a Goldfish, Don’t Memorize! Mitigating Memorization in Generative LLMs


14 Jun 2024 | Abhimanyu Hans1, Yuxin Wen1, Neel Jain1, John Kirchenbauer1, Hamid Kazemi1, Prajwal Singhania1, Siddharth Singh1, Gowthami Somepalli1, Jonas Geiping2,3, Abhinav Bhatele1, Tom Goldstein1
The paper introduces the *goldfish loss*, a technique for mitigating memorization in large language models (LLMs). Memorization, in which a model stores and regenerates training data verbatim, poses copyright and privacy risks. The goldfish loss modifies the next-token training objective by excluding a pseudorandom subset of tokens from the loss computation. Because the model never receives supervision on the dropped tokens, it cannot later reproduce a complete training passage verbatim. Extensive experiments on billion-scale Llama-2 models show large reductions in extractable memorization with minimal impact on downstream performance.

The goldfish loss prevents verbatim reproduction of training data even in aggressive memorization scenarios, and models trained with it still learn effectively from the data, though they may require more training to reach the same loss. The paper also describes a hash-based masking scheme for handling duplicate passages robustly, and evaluates the method's effect on downstream benchmarks and language-modeling ability. While the goldfish loss provides no guarantee against all forms of extraction, it significantly reduces the likelihood of long-form verbatim memorization. The authors conclude that the goldfish loss is a practical tool for mitigating memorization risk in industrial settings, offering simplicity, scalability, and minimal performance impact.
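To make the idea concrete, below is a minimal sketch of a goldfish-style loss in PyTorch. It assumes a drop rate of 1/k and decides whether to drop each token from a hash of the h tokens preceding it, so that duplicate copies of the same passage always mask the same positions; the exact hashing function, window width, and hyperparameters used in the paper may differ, and the Python loop is for clarity rather than speed.

```python
import torch
import torch.nn.functional as F

def goldfish_loss(logits, labels, k=4, h=13, ignore_index=-100):
    """Next-token cross-entropy with roughly 1/k of tokens dropped.

    Whether a token is dropped depends on a hash of the h tokens that
    precede it, so identical passages mask identical positions even if
    they appear multiple times in the training corpus.

    logits: (batch, seq_len, vocab) model outputs
    labels: (batch, seq_len) target token ids
    """
    # Shift so position i predicts token i+1, as in standard LM training.
    logits = logits[:, :-1, :].contiguous()
    targets = labels[:, 1:].clone()

    batch, seq_len = targets.shape
    for b in range(batch):
        for i in range(seq_len):
            # Hash the window of up to h tokens preceding this target.
            # (Hashes of int tuples are deterministic in CPython.)
            context = tuple(targets[b, max(0, i - h):i].tolist())
            if hash(context) % k == 0:
                targets[b, i] = ignore_index  # exclude from the loss

    return F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        targets.view(-1),
        ignore_index=ignore_index,
    )
```

The hash-based mask is what makes the scheme robust to duplicated documents: if the mask were sampled independently per batch, a token dropped in one copy of a passage could still be supervised in another copy, and the passage would be memorized anyway.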