Model Editing by Standard Fine-Tuning


3 Jun 2024 | Govind Gangadhar and Karl Stratos
Standard fine-tuning is often dismissed as less effective than specialized methods for model editing. However, it is simple, model-agnostic, and can leverage advances in standard training techniques, making it an appealing choice. This work shows that standard fine-tuning can achieve competitive model editing performance with two minor modifications: optimizing the conditional likelihood of the edit target rather than the full likelihood of the whole sequence, and training on random or similar unedited facts to encourage locality. Experiments on the ZsRE and COUNTERFACT datasets demonstrate that these modifications allow standard fine-tuning to match or outperform specialized editors in edit score.

Model editing aims to alter a language model so that it memorizes desired information and applies it to new prompts, without changing inferences that should remain unchanged. There is a trade-off between efficacy (memorizing edits), generalization (applying the edited information to new prompts), and locality (preserving inferences on unrelated facts). Naive fine-tuning performs poorly, especially in locality, which motivated the development of specialized editors.

The main task is mass-editing: the model is edited to uphold the relations expressed in a set of facts without changing its behavior on other facts, and is evaluated on efficacy, generalization, and locality. The method optimizes the conditional likelihood of the edit target and augments the training data with paraphrased prompts and random unedited facts. This approach is effective for both mass-editing and single-editing.
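The two modifications described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' released code: the function names, the -100 ignore index (a common convention in training frameworks for excluding positions from the loss), and the sampling scheme are all hypothetical.

```python
import random

IGNORE_INDEX = -100  # label value excluded from the loss (a common framework convention)

def build_conditional_example(prompt_ids, target_ids, ignore_index=IGNORE_INDEX):
    """Concatenate prompt and target token ids, masking the prompt positions in
    the labels. Training on these labels optimizes the conditional likelihood of
    the target given the prompt, not the full likelihood of the whole sequence."""
    input_ids = list(prompt_ids) + list(target_ids)
    labels = [ignore_index] * len(prompt_ids) + list(target_ids)
    return input_ids, labels

def augment_with_unedited(edit_facts, unedited_facts, k, seed=0):
    """Mix k randomly sampled unedited facts into the training set to encourage
    locality: the model keeps rehearsing facts it is not supposed to change."""
    rng = random.Random(seed)
    return list(edit_facts) + rng.sample(list(unedited_facts), k)
```

In practice the `(input_ids, labels)` pairs would be fed to an ordinary language-model training loop; the only deviation from naive fine-tuning is the label mask and the augmented data.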
Experiments on ZsRE and COUNTERFACT show that the method achieves competitive performance, with standard fine-tuning outperforming specialized editors in some cases. While standard fine-tuning improves edit score, it can hurt generative metrics such as fluency and consistency; incorporating a language-modeling loss on Wikipedia text recovers these metrics without significantly affecting edit score. This work challenges the assumption that standard fine-tuning is ineffective for model editing, and suggests that model editing can be achieved as part of standard training rather than through specialized model editors.
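The interplay between the masked edit objective and the auxiliary language-modeling loss can be sketched as a weighted sum. A minimal pure-Python version, assuming per-token cross-entropy averaged over unmasked positions and a hypothetical mixing weight `lam` (the paper's exact weighting is not reproduced here):

```python
import math

IGNORE_INDEX = -100  # positions with this label are excluded from the loss

def masked_cross_entropy(logits, labels, ignore_index=IGNORE_INDEX):
    """Average negative log-likelihood over positions whose label is not masked.
    `logits` is a list of per-position score vectors; masked (prompt) positions
    contribute nothing, so this computes a conditional likelihood."""
    total, count = 0.0, 0
    for row, y in zip(logits, labels):
        if y == ignore_index:
            continue
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_z - row[y]
        count += 1
    return total / max(count, 1)

def combined_loss(edit_loss, lm_loss, lam=1.0):
    """Edit objective plus a weighted language-modeling loss on generic text
    (e.g. Wikipedia), intended to preserve fluency and consistency."""
    return edit_loss + lam * lm_loss
```

Setting `lam` to zero recovers plain edit training; a positive weight trades a small amount of edit score for better generative quality, as the results above describe.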