29 May 2024 | Jiaqi Li, Qianshan Wei, Chuanyi Zhang, Guilin Qi, Miaozeng Du, Yongrui Chen, Sheng Bi
The paper introduces Single Image Unlearning (SIU), an efficient method to unlearn the visual recognition of concepts in Multimodal Large Language Models (MLLMs) using only a single training image. SIU addresses the challenge of unlearning visual recognition by constructing multifaceted fine-tuning data and introducing a Dual Masked KL-divergence Loss. The method aims to align the model's output distribution with that of a model trained without the target concept, assign new visual descriptions, decouple factual knowledge, and preserve non-targeted knowledge. The authors also establish MMUBench, a comprehensive benchmark for evaluating machine unlearning in MLLMs, and introduce metrics for efficacy, generality, specificity, fluency, and diversity. Experimental results show that SIU outperforms existing methods in all evaluation metrics and demonstrates robustness against membership inference and jailbreak attacks. The paper highlights the positive butterfly effect of unlearning, where the model retains some knowledge while effectively forgetting others, and discusses the trade-offs between unlearning effectiveness and model utility.
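The summary above does not spell out the loss, so the following PyTorch sketch only illustrates the general idea of aligning the unlearning model's token distributions with those of a reference model on a masked subset of positions. The function name, masking scheme, and tensor shapes are assumptions made for illustration; this is not the paper's actual Dual Masked KL-divergence Loss.

```python
import torch
import torch.nn.functional as F

def masked_kl_alignment_loss(student_logits, reference_logits, mask):
    """Hypothetical masked KL term for unlearning-style distribution alignment.

    student_logits:   (batch, seq_len, vocab) logits of the model being unlearned
    reference_logits: (batch, seq_len, vocab) logits of a reference model assumed
                      not to have seen the target concept
    mask:             (batch, seq_len) boolean mask selecting the token positions
                      at which the two distributions should be aligned
    """
    log_p = F.log_softmax(student_logits, dim=-1)   # student log-probabilities
    q = F.softmax(reference_logits, dim=-1)         # reference probabilities
    # Per-token KL(q || p), summed over the vocabulary dimension.
    kl = F.kl_div(log_p, q, reduction="none").sum(dim=-1)  # (batch, seq_len)
    mask = mask.float()
    # Average only over the masked positions (guard against an empty mask).
    return (kl * mask).sum() / mask.sum().clamp(min=1.0)
```

In practice such a term would be combined with the paper's other objectives (e.g., fine-tuning on the constructed multifaceted data) and weighted against a utility-preserving term, reflecting the unlearning-versus-utility trade-off the authors discuss.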