This survey provides an in-depth analysis of knowledge conflicts in large language models (LLMs), highlighting the challenges that arise when contextual and parametric knowledge must be integrated. The paper categorizes knowledge conflicts into three types: context-memory, inter-context, and intra-memory conflicts. These conflicts can significantly affect the trustworthiness and performance of LLMs, especially in real-world applications where noise and misinformation are common. The survey examines the causes of each conflict type, how LLMs behave when they occur, and the solutions proposed so far, aiming to improve the robustness of LLMs and to serve as a resource for advancing research in this area.
LLMs encapsulate a vast repository of world knowledge (parametric knowledge) and can also draw on external contextual knowledge (e.g., user prompts, retrieved documents, or tool outputs). Integrating contextual knowledge allows LLMs to stay current with recent events and generate more accurate responses, but the richness and diversity of these knowledge sources also make discrepancies likely. Such discrepancies, within and between contextual and parametric knowledge, are referred to as knowledge conflicts. The paper distinguishes three types: context-memory (CM), inter-context (IC), and intra-memory (IM) conflicts.
Context-memory conflict arises when the provided context contradicts the model's parametric knowledge. It can be caused by temporal misalignment or misinformation pollution: temporal misalignment occurs when the model's parametric knowledge is outdated relative to the context, while misinformation pollution refers to false or misleading information injected into the context. Solutions discussed include knowledge editing, retrieval-augmented generation, and continual learning.
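As a concrete illustration (not taken from the paper), a minimal way to surface a context-memory conflict is to compare the model's closed-book answer with its answer when the retrieved context is supplied. The sketch below assumes a hypothetical `generate` helper standing in for any LLM completion call; the prompts and comparison logic are illustrative only.

```python
# Minimal sketch of surfacing a context-memory (CM) conflict.
# `generate` is a hypothetical helper standing in for an LLM completion call.

def generate(prompt: str) -> str:
    """Placeholder for an LLM call (e.g., an API or a local model)."""
    raise NotImplementedError

def detect_context_memory_conflict(question: str, retrieved_context: str) -> bool:
    # Closed-book answer: relies only on parametric knowledge.
    parametric_answer = generate(f"Answer from memory only: {question}")

    # Open-book answer: the same question, grounded in the retrieved context.
    contextual_answer = generate(
        f"Context:\n{retrieved_context}\n\nAnswer using only the context: {question}"
    )

    # A crude string comparison; in practice one would normalize answers or use
    # an entailment/equivalence check between the two responses.
    return parametric_answer.strip().lower() != contextual_answer.strip().lower()
```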
Inter-context conflict occurs when multiple external documents (e.g., retrieved passages) contain contradictory information, typically because some of them carry misinformation or outdated facts. The paper discusses how these conflicts affect LLMs and surveys solutions such as specialized conflict-detection models, general-purpose models, and query augmentation techniques; a simple detection sketch follows below.
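To make the inter-context case concrete, one common-sense approach (offered here as an illustration, not as the paper's method) is to run a natural language inference check over pairs of retrieved passages and flag contradictory pairs. The `nli_label` helper below is hypothetical; any NLI model returning entailment/neutral/contradiction labels could back it.

```python
# Illustrative sketch of flagging inter-context (IC) conflicts among retrieved passages.
# `nli_label` is a hypothetical helper backed by any NLI classifier that returns
# "entailment", "neutral", or "contradiction".

from itertools import combinations

def nli_label(premise: str, hypothesis: str) -> str:
    """Placeholder for an NLI classifier call."""
    raise NotImplementedError

def find_conflicting_pairs(passages: list[str]) -> list[tuple[int, int]]:
    conflicts = []
    for (i, a), (j, b) in combinations(enumerate(passages), 2):
        # Check both directions, since contradictions are not always detected symmetrically.
        if nli_label(a, b) == "contradiction" or nli_label(b, a) == "contradiction":
            conflicts.append((i, j))
    return conflicts
```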
Intra-memory conflict arises when an LLM gives inconsistent responses to semantically equivalent inputs. It can stem from biases in the training corpus, decoding strategies, or prior knowledge editing. The paper examines the impact of these inconsistencies and surveys solutions such as consistency-oriented fine-tuning, plug-in methods, and output ensemble techniques, sketched below.
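As a rough illustration of the output-ensemble idea (again, a sketch under assumptions rather than the paper's specific technique), one can query the model with several paraphrases of the same question, majority-vote the answers, and treat the vote share as a consistency signal. The `generate` helper and the example paraphrases are hypothetical.

```python
# Illustrative sketch of an output-ensemble check for intra-memory (IM) inconsistency:
# ask semantically equivalent paraphrases and majority-vote the answers.
# `generate` is a hypothetical LLM call; the paraphrases are assumed to be supplied.

from collections import Counter

def generate(prompt: str) -> str:
    """Placeholder for an LLM call."""
    raise NotImplementedError

def ensemble_answer(paraphrases: list[str]) -> tuple[str, float]:
    answers = [generate(p).strip().lower() for p in paraphrases]
    answer, count = Counter(answers).most_common(1)[0]
    consistency = count / len(answers)  # 1.0 means every paraphrase agreed
    return answer, consistency

# Example usage with hypothetical paraphrases of one question:
# answer, consistency = ensemble_answer([
#     "What is the capital of Australia?",
#     "Which city is Australia's capital?",
#     "Name the capital city of Australia.",
# ])
```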
The paper also discusses the challenges and future directions in addressing knowledge conflicts in LLMs, emphasizing the need for more research on real-world scenarios and more fine-grained approaches to solving these conflicts.