23 Jun 2024 | Roy Xie, Junlin Wang, Ruomin Huang, Minxing Zhang, Rong Ge, Jian Pei, Neil Zhenqiang Gong, Bhuwan Dhingra
This paper introduces RECALL, a membership inference attack (MIA) that uses the conditional language modeling capabilities of large language models (LLMs) to detect pretraining data. RECALL measures the relative change in a target data point's conditional log-likelihood when it is prefixed with non-member context. The key insight is that conditioning on non-member prefixes induces a larger decrease in log-likelihood for member data than for non-member data. Empirical results show that RECALL achieves state-of-the-art performance on the WikiMIA dataset, even with random or synthetic prefixes, and that an ensemble approach improves it further. The method is also effective on the more challenging MIMIR benchmark, where members and non-members are similar. Because RECALL requires neither access to the pretraining data distribution nor a reference model, it is practical for detecting pretraining data. The paper also analyzes how LLMs behave under different membership contexts, showing how they use membership information for inference at both the sequence and token level. The results show that RECALL is robust to random prefix selection and can outperform existing MIA methods, and that it remains effective with synthetic prefixes generated by LLMs, which broadens its applicability to practical scenarios. The paper concludes that RECALL is a promising approach for detecting pretraining data in LLMs, with room for further theoretical and practical improvements.
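The scoring idea described above can be illustrated with a minimal sketch. It assumes the score is the ratio of the prefixed (conditional) average token log-likelihood to the unprefixed one, which is one natural reading of "relative change"; the summary itself does not spell out the formula. The function names `mean_log_likelihood` and `recall_score` are hypothetical, and the per-token log-probabilities are assumed to come from scoring the target text with an LLM twice, once alone and once after a non-member prefix (that step is not shown).

```python
from typing import Sequence


def mean_log_likelihood(token_logprobs: Sequence[float]) -> float:
    """Average per-token log-likelihood of a target text."""
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    return sum(token_logprobs) / len(token_logprobs)


def recall_score(
    unconditional_logprobs: Sequence[float],
    conditional_logprobs: Sequence[float],
) -> float:
    """Relative change in log-likelihood caused by a non-member prefix.

    Assumption (not stated in the summary): the score is the ratio
    LL(x | prefix) / LL(x). Log-likelihoods are negative, so a larger
    drop under the prefix gives a larger ratio.
    """
    ll_unconditional = mean_log_likelihood(unconditional_logprobs)
    ll_conditional = mean_log_likelihood(conditional_logprobs)
    if ll_unconditional == 0.0:
        raise ValueError("unconditional log-likelihood is zero; ratio undefined")
    return ll_conditional / ll_unconditional


# Hypothetical token log-probabilities, for illustration only.
member_like = recall_score([-1.0, -2.0, -3.0], [-2.0, -3.0, -4.0])
non_member_like = recall_score([-4.0, -4.0], [-4.0, -4.0])
print(member_like, non_member_like)
```

Under this assumed formulation, a higher score means the prefix reduced the log-likelihood more, which the summary associates with member data. In practice a decision threshold on the score would have to be chosen, for example on a labeled validation set.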