25 Jun 2024 | André V. Duarte, Xuandong Zhao, Arlindo L. Oliveira, Lei Li
DE-COP is a novel method designed to detect copyrighted content in the training data of language models (LMs). The core approach probes the LLM with multiple-choice questions whose options include a verbatim passage and its paraphrases. The method is evaluated on two benchmarks: BookTection, which includes 165 books published before and after a model's training cutoff, and arXivTection, a collection of recent and older arXiv research papers. DE-COP outperforms existing methods by 9.6% in AUC on models with logits available and achieves an average accuracy of 72% on fully black-box models. The method also includes a calibration technique to minimize selection bias in the model's probability assignments. The results suggest that LMs can accurately identify copyrighted content, highlighting the need for ethical and legal standards in LM development.
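As a rough illustration of the probing setup described above, the sketch below builds a multiple-choice question from a verbatim passage and three paraphrases, shuffles the options, and checks whether a model picks the verbatim one. The prompt wording, the `query_model` stub, the example paraphrases, and the permutation-averaging step (a simple stand-in for the paper's probability-calibration technique) are all assumptions for illustration, not the authors' exact implementation.

```python
import random
import string

def build_mcq(verbatim: str, paraphrases: list[str], seed: int = 0):
    """Build a multiple-choice prompt mixing the verbatim passage with its
    paraphrases, and record which letter marks the verbatim option."""
    options = [verbatim] + paraphrases
    rng = random.Random(seed)
    rng.shuffle(options)
    letters = string.ascii_uppercase[: len(options)]
    lines = [f"{letter}. {text}" for letter, text in zip(letters, options)]
    prompt = (
        "Which of the following passages is the exact text from the source?\n"
        + "\n".join(lines)
        + "\nAnswer with a single letter."
    )
    answer = letters[options.index(verbatim)]
    return prompt, answer

def query_model(prompt: str) -> str:
    """Placeholder for a call to the LM under test (e.g. an API request).
    Replace with a real client; it returns 'A' here so the script runs."""
    return "A"

def detection_rate(examples, n_permutations: int = 4) -> float:
    """Fraction of questions where the model selects the verbatim option.
    Asking each question under several option orderings is a simple way to
    average out position/selection bias (not the paper's calibration)."""
    correct = total = 0
    for verbatim, paraphrases in examples:
        for seed in range(n_permutations):
            prompt, answer = build_mcq(verbatim, paraphrases, seed=seed)
            reply = query_model(prompt).strip()[:1].upper()
            correct += int(reply == answer)
            total += 1
    return correct / total if total else 0.0

if __name__ == "__main__":
    examples = [
        ("It was the best of times, it was the worst of times.",
         ["Those days were at once the finest and the most dreadful.",
          "The era was simultaneously wonderful and terrible.",
          "Times were both the greatest and the bleakest imaginable."]),
    ]
    # A rate well above chance (here 1/4) suggests the passage was seen in training.
    print(f"verbatim-selection rate: {detection_rate(examples):.2f}")
```

In this sketch, a selection rate substantially above the 1-in-4 chance level on passages published before the model's cutoff, but near chance on later passages, would be the kind of signal DE-COP uses to flag suspected training-set membership.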