Propagation and Pitfalls: Reasoning-based Assessment of Knowledge Editing through Counterfactual Tasks


31 Jan 2024 | Wenyue Hua†*, Jiang Guo†*, Mingwen Dong†, Henghui Zhu†, Patrick Ng†, Zhiguo Wang†
This paper addresses the challenge of propagating updates to interconnected facts when editing the knowledge of language models. To evaluate this, the authors introduce ReCoE (Reasoning-based Counterfactual Editing dataset), a reasoning-based benchmark covering six common reasoning schemes drawn from real-world scenarios. Unlike existing benchmarks, which often rely on synthetic data, ReCoE is designed to capture the complexity of realistic fact-editing tasks.

The authors assess three families of knowledge editing techniques: input augmentation, finetuning, and locate-and-edit methods. All of them perform poorly on ReCoE, with especially low scores on certain reasoning schemes. To explain why, they analyze the chain-of-thought (CoT) generations of edited models along three dimensions: the quality of fact-wise edits, the model's ability to recall edited facts, and the coherence of its generations. The paper closes with a comprehensive analysis of the limitations of current knowledge editing approaches, including the impact of model scaling, and argues that more effective editing techniques are needed to make these models reliable in real-world scenarios.
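To make the core difficulty concrete, the toy sketch below illustrates what "edit propagation" means: changing one fact should change the answers to multi-hop questions that depend on it. This is a minimal illustration using a symbolic fact store in place of a real language model; the names (`FactStore`, `apply_edit`, `multi_hop_answer`) are hypothetical and are not the paper's API or methodology.

```python
# Toy sketch of edit propagation, NOT the ReCoE evaluation code.
# A symbolic (subject, relation) -> object store stands in for a
# model's parametric knowledge.

class FactStore:
    """Stand-in for a model's stored knowledge."""
    def __init__(self, facts):
        self.facts = dict(facts)

    def lookup(self, subject, relation):
        return self.facts.get((subject, relation))


def apply_edit(store, subject, relation, new_object):
    """Fact-wise edit: overwrite a single triple (loosely analogous
    to what locate-and-edit methods do inside model weights)."""
    store.facts[(subject, relation)] = new_object


def multi_hop_answer(store, subject, relations):
    """Compose facts along a chain of relations, the way a
    chain-of-thought hops fact by fact. Returns None on a
    recall failure (a missing hop)."""
    entity = subject
    for rel in relations:
        entity = store.lookup(entity, rel)
        if entity is None:
            return None
    return entity


store = FactStore({
    ("Eiffel Tower", "located_in"): "Paris",
    ("Paris", "capital_of"): "France",
})

# Counterfactual edit: relocate the Eiffel Tower to Rome.
apply_edit(store, "Eiffel Tower", "located_in", "Rome")
apply_edit(store, "Rome", "capital_of", "Italy")

# The two-hop query now depends on the edited fact; if the edit
# propagates, the answer flips from "France" to "Italy".
print(multi_hop_answer(store, "Eiffel Tower", ["located_in", "capital_of"]))
```

In the symbolic store, propagation is trivial because the second hop rereads the first hop's output. The paper's finding is that edited neural models often fail at exactly this step: the edited fact may be stored, but downstream hops still recall the stale fact or produce incoherent reasoning.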