Machine Unlearning of Pre-trained Large Language Models

30 May 2024 | Jin Yao, Eli Chien, Minxin Du, Xinyao Niu, Tianhao Wang, Zezhou Cheng, Xiang Yue
This study explores "the right to be forgotten" in the context of large language models (LLMs), focusing on the under-researched problem of machine unlearning for pre-trained models. The authors propose a unified framework for machine unlearning in pre-trained LLMs and evaluate seven diverse unlearning methods on curated datasets drawn from arXiv papers, books, and GitHub code. They show that these methods are over 10^5 times more computationally efficient than retraining from scratch, and that combining gradient ascent on the data to be forgotten with gradient descent on in-distribution data improves robustness to hyperparameter choices.

The work addresses three obstacles to unlearning pre-trained LLMs: existing unlearning methods were designed for fine-tuned models and do not transfer directly, pre-training data is generally not publicly available, and the high cost of retraining rules out an exact retrained-model baseline for comparison. The main contributions are a unified unlearning framework, an approximate-retraining baseline for evaluation, and open-source datasets. The results demonstrate the effectiveness of the proposed methods, particularly for removing copyrighted content, and the paper offers practical guidelines for efficient hyperparameter tuning, contributing to the discourse on ethical and responsible AI development.
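The core idea of combining gradient ascent on the forget set with gradient descent on in-distribution data can be illustrated on a toy model. The sketch below is a minimal NumPy illustration under assumed settings (a linear model with squared-error loss, a hypothetical `unlearning_step` update, and an arbitrary ascent weight); it is not the authors' exact formulation or loss weighting, which involves full LLM training objectives.

```python
import numpy as np

def mse_loss_grad(w, X, y):
    """Mean-squared-error loss and its gradient for a linear model."""
    err = X @ w - y
    loss = float(np.mean(err ** 2))
    grad = 2.0 * X.T @ err / len(y)
    return loss, grad

def unlearning_step(w, forget, retain, lr=0.05, ascent_weight=1.0):
    """One update: ascend the loss on forget data, descend on retain data.

    The retain (in-distribution) term anchors the model so the ascent on
    the forget set does not destroy overall performance.
    """
    _, g_forget = mse_loss_grad(w, *forget)
    _, g_retain = mse_loss_grad(w, *retain)
    # +forget gradient = ascent; -retain gradient = ordinary descent.
    return w + lr * (ascent_weight * g_forget - g_retain)

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])
X_retain = rng.normal(size=(64, 2))
y_retain = X_retain @ w_true                  # in-distribution data
X_forget = rng.normal(size=(16, 2))
y_forget = X_forget @ w_true + 3.0            # "memorized" shifted targets

w = w_true.copy()  # start from a model that fits the retain data
for _ in range(50):
    w = unlearning_step(w, (X_forget, y_forget), (X_retain, y_retain))

forget_loss, _ = mse_loss_grad(w, X_forget, y_forget)
retain_loss, _ = mse_loss_grad(w, X_retain, y_retain)
print(f"forget loss: {forget_loss:.3f}, retain loss: {retain_loss:.3f}")
```

After the loop, the loss on the forget set stays well above the loss on the retained, in-distribution data, which is the qualitative behavior an unlearning objective of this shape targets.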