16 Jul 2019 | Nicholas Carlini, Chang Liu, Úlfar Erlingsson, Jernej Kos, Dawn Song
This paper introduces a testing methodology to quantitatively assess the risk that rare or unique training-data sequences are unintentionally memorized by generative sequence models. Such models are often trained on sensitive data, making this methodology valuable for privacy protection. The authors demonstrate that unintended memorization is a persistent issue that can have serious consequences. They describe new, efficient procedures to extract secret sequences such as credit card numbers and show that their testing strategy is practical and effective. The methodology is applied to Google's Smart Compose, a commercial text-completion model trained on millions of users' email messages.
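The core of the methodology is to insert randomly generated "canary" sequences of a known format into the training data and later test whether the model has memorized them. The sketch below illustrates that insertion step under stated assumptions; the format string, digit count, and corpus are hypothetical and not the authors' implementation.

```python
# Minimal sketch of canary insertion, assuming a fixed-format phrase filled with
# a random secret drawn from a known space; names are illustrative only.
import random

def make_canary(fmt="my credit card number is {}", digits=9, seed=None):
    """Fill a fixed-format phrase with a random secret of the given length."""
    rng = random.Random(seed)
    secret = "".join(str(rng.randrange(10)) for _ in range(digits))
    return fmt.format(" ".join(secret)), secret

corpus = ["ordinary training sentence one", "ordinary training sentence two"]
canary, secret = make_canary(seed=42)
corpus.append(canary)  # the canary is inserted once (or a few times) into the data
print(canary)          # e.g. "my credit card number is 3 7 0 ..."
```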
The paper explores how neural networks can unintentionally memorize training data, even when that data is rare. It shows that this memorization can occur early in training and persists across different model architectures and training strategies. The authors propose an exposure metric to measure how strongly a model has memorized a secret sequence: the model's log-perplexity on an inserted canary is compared against the log-perplexities of all other candidate sequences drawn from the same format, so exposure effectively measures the canary's rank among equivalent random sequences. The exposure metric is shown to be effective at identifying and quantifying unintended memorization.
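The sketch below shows the rank-based exposure computation under the assumption that the model can be queried for the log-perplexity of any candidate sequence; `log-perplexity` values here are toy numbers, and the helper is a hypothetical stand-in, not the authors' code.

```python
# Minimal sketch: exposure = log2(#candidates) - log2(rank of the canary),
# where rank 1 means the canary has the lowest log-perplexity of all candidates.
import math

def exposure(canary_logppl, candidate_logppls):
    """Compute exposure of a canary given log-perplexities of random candidates."""
    rank = 1 + sum(1 for x in candidate_logppls if x < canary_logppl)
    return math.log2(len(candidate_logppls)) - math.log2(rank)

# Toy example: only 10 of 10,000 random candidates score better than the canary,
# giving exposure = log2(10000) - log2(11), roughly 9.8 bits.
random_scores = [float(i) for i in range(1, 10_001)]
print(exposure(canary_logppl=10.5, candidate_logppls=random_scores))
```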
The paper also discusses the limitations of traditional regularization techniques like early-stopping and dropout in preventing unintended memorization. Instead, the authors advocate for differentially-private training techniques to eliminate the issue. The exposure-based testing methodology proves practical for identifying and mitigating unintended memorization in real-world applications: the results demonstrate that such memorization is a significant concern and that the proposed methodology offers a concrete way to assess and reduce the risk.
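The sketch below illustrates one differentially-private SGD step (per-example gradient clipping plus calibrated Gaussian noise), in the spirit of the differentially-private training the authors recommend; it is an illustrative NumPy toy under assumed hyperparameters, not the paper's TensorFlow setup.

```python
# Minimal sketch of a DP-SGD update: clip each example's gradient, average,
# add Gaussian noise scaled by the clipping norm, then take a gradient step.
import numpy as np

def dp_sgd_step(params, per_example_grads, rng, lr=0.1, clip_norm=1.0, noise_mult=1.1):
    """One differentially-private SGD step over a batch of per-example gradients."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_mult * clip_norm / len(per_example_grads),
                       size=mean_grad.shape)
    return params - lr * (mean_grad + noise)

# Toy usage: three per-example gradients for a 2-parameter model.
rng = np.random.default_rng(0)
params = np.zeros(2)
grads = [np.array([3.0, 4.0]), np.array([0.1, -0.2]), np.array([-1.0, 1.0])]
params = dp_sgd_step(params, grads, rng)
```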