10 Jun 2024 | Akshat Gupta, Anurag Rao, Gopala Anumanchipalli
The paper "Model Editing at Scale leads to Gradual and Catastrophic Forgetting" by Akshat Gupta, Anurag Rao, and Gopala Anumanchipalli from UC Berkeley explores the scalability and practical utility of model editing techniques in large language models (LLMs). The authors argue that current model editing methods, such as ROME and MEMIT, are evaluated using metrics for reliability, specificity, and generalization over a few edits, but their scalability is crucial for real-world applications. They evaluate these methods at scale, focusing on three key properties: editing proficiency, fact forgetting, and downstream task performance.
Key findings include:
1. **Gradual and Catastrophic Forgetting**: As facts are edited into the model sequentially, it goes through two phases of forgetting. In the gradual phase, the model slowly loses the ability to recall previously edited facts and to perform downstream tasks. In the catastrophic phase, the model abruptly forgets all previously edited facts and becomes unable to perform any downstream task.
2. **Model Degradation**: Both ROME and MEMIT outperform MEND and fine-tuning baselines at scale, but their edits are not as localized as previously believed. New edits consistently affect other facts stored in the model.
3. **Downstream Task Performance**: The performance of edited models on downstream tasks degrades gradually and then abruptly, coinciding with the points of gradual and catastrophic forgetting.
4. **Disabling Edits**: The authors identify and analyze "disabling edits", single edits that destroy the model's ability to perform downstream tasks, highlighting fundamental limitations of ROME; a simple way to flag such a collapse from the tracked metrics is sketched after this list.
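As a rough illustration of how findings 1 and 4 show up in the metrics tracked by the loop above, the toy heuristic below flags the first checkpoint at which downstream accuracy collapses. The 0.5 drop ratio is an arbitrary assumption for illustration, not a criterion from the paper.

```python
# Toy heuristic over the `history` produced by the loop above: gradual
# forgetting appears as a slow decline in "retention", while a disabling edit
# appears as a single step where "downstream_acc" collapses. The drop ratio
# is an illustrative assumption.

def find_collapse_point(history, drop_ratio=0.5):
    """Return the edit count at the first checkpoint where downstream accuracy
    falls below `drop_ratio` times its previous value, or None if it never does."""
    for prev, curr in zip(history, history[1:]):
        if prev["downstream_acc"] > 0 and \
                curr["downstream_acc"] < drop_ratio * prev["downstream_acc"]:
            return curr["num_edits"]
    return None
```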
The authors call for better evaluation methods and the development of more scalable model editing techniques to address these issues. They also provide a detailed analysis of the source of forgetting, suggesting that the layers being edited drift away from their original weight values, leading to incompatibility with the rest of the model.
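A minimal sketch of this kind of drift analysis, assuming a PyTorch model: compare an edited layer's weight matrix against the corresponding weights of the unedited model. The default layer name is only an example (ROME targets a mid-layer MLP projection in GPT-style models) and is an assumption, not the paper's exact setting.

```python
import torch

def weight_drift(original_model, edited_model,
                 layer_name="transformer.h.17.mlp.c_proj.weight"):
    """Relative L2 distance between an edited layer's weights and the original
    weights; larger values suggest the edited layer has drifted further from
    the rest of the (unchanged) model. The default layer name is illustrative."""
    w_orig = dict(original_model.named_parameters())[layer_name].detach()
    w_edit = dict(edited_model.named_parameters())[layer_name].detach()
    return (torch.norm(w_edit - w_orig) / torch.norm(w_orig)).item()
```

Tracking this value after every edit would show whether the onset of forgetting coincides with a jump in drift.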
The paper concludes by emphasizing the need for model editing techniques to counteract both gradual and catastrophic forgetting to ensure practical utility at scale.