12 Jun 2024 | Jiabao Ji1*, Yujuan Liu1, Yang Zhang2, Gaowen Liu3, Ramana Rao Kompella3, Sijia Liu4, Shiyu Chang1
This paper introduces a novel LLM unlearning framework called Unlearning from Logit Difference (ULD), which addresses two challenges of conventional LLM unlearning methods: degenerated output and catastrophic forgetting. ULD inverts the usual objective by training an assistant LLM to remember the forget documents and forget the retain knowledge; the unlearned LLM is then derived by computing the logit difference between the target LLM and the assistant LLM. This approach naturally resolves the issues of an unbounded forget loss and under-representative retain data, leading to improved training efficiency and better forgetting performance. Extensive experiments on the TOFU and Harry Potter datasets demonstrate that ULD achieves 0% loss of model utility while reducing training time by more than threefold compared to baseline methods. The framework also preserves the LLM's overall capabilities, making it a promising solution for mitigating privacy and data-leakage issues in LLM usage.
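The core logit-difference operation described above can be illustrated with a minimal sketch. The function below is an illustrative assumption, not the paper's implementation: the scaling factor `alpha` and the toy logit values are invented for demonstration, and real usage would operate on per-token logits produced by two language models.

```python
import numpy as np

def unlearned_logits(target_logits: np.ndarray,
                     assistant_logits: np.ndarray,
                     alpha: float = 1.0) -> np.ndarray:
    """Combine logits by subtraction: the assistant model is trained to
    *remember* the forget documents, so subtracting its (scaled) logits
    from the target model's suppresses that knowledge at decoding time.
    `alpha` is a hypothetical strength knob, not from the paper."""
    return target_logits - alpha * assistant_logits

# Toy vocabulary of 5 tokens; a higher logit means a more likely next token.
target = np.array([2.0, 0.5, 1.0, -1.0, 0.0])     # target LLM favors token 0
assistant = np.array([3.0, -0.5, 0.2, 0.0, 0.1])  # assistant strongly favors
                                                  # token 0 (forget knowledge)

combined = unlearned_logits(target, assistant, alpha=1.0)
print(int(combined.argmax()))  # token 0 is no longer the top choice; prints 1
```

The subtraction only shifts probability mass away from tokens the assistant rates highly, which is why retain knowledge (where the assistant is deliberately uninformative) is left largely intact.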