This paper addresses the issue of model collapse during sequential model editing using Rank-One Model Editing (ROME). Previous studies showed that certain edits, called disabling edits, caused sudden model collapse, limiting the use of ROME for sequential editing. The authors identify that these disabling edits are not inherent to the optimization objective of ROME but are artifacts of irregularities in its implementation. They propose a more stable implementation of ROME, called r-ROME, which avoids model collapse during large-scale sequential edits and improves the generalization and locality of model editing compared to the original ROME.
The paper introduces two metrics to identify disabling edits: generation entropy and the norm of the matrix update. Using these metrics, the authors evaluate the performance of ROME on two datasets, CounterFact and zsRE. They find that disabling edits occur only when editing facts from the CounterFact dataset, not the zsRE dataset. This is attributed to differences between the datasets, such as the type of facts and the format of the prompts used.
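To make the two metrics concrete, here is a minimal sketch of how they might be computed. It assumes generation entropy is a weighted average of the bigram and trigram entropies of generated text, and that the update norm is the Frobenius norm of the weight change. The function names and the weights (2/3 and 4/3) are illustrative assumptions, not details taken from the summary above.

```python
import math
from collections import Counter

import numpy as np


def ngram_entropy(tokens, n):
    """Shannon entropy (in bits) of the empirical n-gram distribution."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    total = len(ngrams)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def generation_entropy(tokens, weights=((2, 2 / 3), (3, 4 / 3))):
    """Weighted average of n-gram entropies (the weights are an assumption)."""
    weight_sum = sum(w for _, w in weights)
    return sum(w * ngram_entropy(tokens, n) for n, w in weights) / weight_sum


def update_norm(w_before, w_after):
    """Frobenius norm of the weight change introduced by a single edit."""
    return float(np.linalg.norm(w_after - w_before))


if __name__ == "__main__":
    healthy = "the cat sat on the mat while the dog slept by the door".split()
    collapsed = ["the"] * len(healthy)  # degenerate, repetitive output
    print("healthy entropy:  ", generation_entropy(healthy))
    print("collapsed entropy:", generation_entropy(collapsed))
```

Under these assumptions, a collapsed model that repeats one token scores near-zero entropy, and an edit would be flagged when the entropy of generated text drops sharply or the update norm is unusually large.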
The core issue with ROME's original implementation was the asymmetric usage of key-vectors in the update equation. The authors correct this by using a homogeneous key-vector in the update equation, leading to the development of r-ROME. This implementation significantly reduces the magnitude of updates (Δ), preventing model collapse and enabling stable and scalable model editing.
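The role of a homogeneous key-vector can be illustrated with the standard rank-one update, Δ = (v − W k)(C⁻¹k)ᵀ / ((C⁻¹k)ᵀ k), where W is the edited weight matrix, C is a covariance matrix of keys, k is the key-vector and v is the target value. The NumPy sketch below is an illustration under stated assumptions, not the authors' code. The function `mixed_key_update` shows one hypothetical way two different key-vectors could be mixed, which is not necessarily the pattern found in the original implementation.

```python
import numpy as np


def homogeneous_update(W, C, k, v):
    """Rank-one update that uses the same key k in every term."""
    c_inv_k = np.linalg.solve(C, k)   # C^{-1} k
    residual = v - W @ k              # gap between target and current output
    return np.outer(residual, c_inv_k) / (c_inv_k @ k)


def mixed_key_update(W, C, k_avg, k_single, v):
    """Hypothetical asymmetric variant: the residual uses a different key."""
    c_inv_k = np.linalg.solve(C, k_avg)
    residual = v - W @ k_single       # key mismatch, for illustration only
    return np.outer(residual, c_inv_k) / (c_inv_k @ k_avg)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_in, d_out = 3, 4
    W = rng.normal(size=(d_out, d_in))
    samples = rng.normal(size=(d_in, 50))
    # Symmetric positive-definite stand-in for the key covariance.
    C = samples @ samples.T / 50 + 1e-3 * np.eye(d_in)
    k_avg = rng.normal(size=d_in)
    k_single = k_avg + 0.5 * rng.normal(size=d_in)
    v = rng.normal(size=d_out)

    d_hom = homogeneous_update(W, C, k_avg, v)
    d_mix = mixed_key_update(W, C, k_avg, k_single, v)
    print("homogeneous constraint error:", np.linalg.norm((W + d_hom) @ k_avg - v))
    print("mixed-key constraint error:  ", np.linalg.norm((W + d_mix) @ k_avg - v))
```

With a single key-vector, the edited matrix maps k exactly to v. In the mixed variant that guarantee is lost, which gives some intuition for why inconsistent key usage can produce irregular updates.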
The paper also evaluates the performance of r-ROME and another variant, p-ROME, on downstream tasks. r-ROME shows better generalization and locality of edits, while p-ROME offers slightly higher efficacy at the expense of generalization. The results demonstrate that using homogeneous key-vectors is crucial for effective model editing.
The authors conclude that r-ROME provides a more stable and scalable approach to model editing, enabling sequential editing without sudden model collapse. However, they note that downstream performance degradation and decreased stability can still occur at scale, which they describe as an inherent limitation of ROME.