A Survey on Hallucination in Large Vision-Language Models
6 May 2024 | Hanchao Liu, Wenyuan Xue, Yifei Chen, Dapeng Chen, Xiutian Zhao, Ke Wang, Liping Hou, Rongjun Li, Wei Peng
This survey explores hallucination in Large Vision-Language Models (LVLMs), a critical issue affecting their practical application. Hallucination refers to discrepancies between factual visual content and the corresponding generated text. The paper provides an in-depth analysis of hallucination symptoms, causes, and mitigation strategies in LVLMs. It discusses the causes of hallucination in LVLMs, including data bias, limitations of vision encoders, misalignment between modalities, and issues inherited from the underlying large language models (LLMs). The survey also presents evaluation methods and benchmarks for assessing hallucination in LVLMs, covering both non-hallucinatory generation and hallucination discrimination. Various mitigation approaches are discussed, such as improving training data, enhancing vision encoders, refining connection modules, and optimizing LLMs. The paper highlights the need for further research in hallucination detection and mitigation, and in the development of more reliable and efficient LVLMs. The survey concludes with future directions for research on LVLM hallucination, emphasizing the importance of understanding the underlying causes and developing effective solutions to address this challenge.
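
For readers unfamiliar with the hallucination-discrimination setting mentioned above, the sketch below illustrates how such a benchmark is commonly scored: the model answers yes/no questions about whether an object appears in an image, and accuracy, precision, recall, and the "yes" ratio are computed against ground-truth annotations. This is a minimal, hypothetical illustration, not code from the survey; all function and variable names are assumptions.

```python
# Illustrative sketch of scoring a POPE-style hallucination-discrimination
# evaluation: the LVLM answers "yes"/"no" to questions such as
# "Is there a <object> in the image?", and answers are compared to ground truth.
# All names below are hypothetical; the survey itself does not prescribe code.

from typing import Dict, List


def score_discrimination(predictions: List[str], labels: List[str]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, F1, and the 'yes' ratio.

    predictions, labels: parallel lists of "yes" / "no" strings.
    A "yes" on a question whose ground truth is "no" is a hallucinated answer.
    """
    assert len(predictions) == len(labels)
    tp = fp = tn = fn = 0
    for pred, gold in zip(predictions, labels):
        if pred == "yes" and gold == "yes":
            tp += 1
        elif pred == "yes" and gold == "no":
            fp += 1  # hallucination: claims an object that is not present
        elif pred == "no" and gold == "no":
            tn += 1
        else:
            fn += 1
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "yes_ratio": (tp + fp) / total,  # strong skew toward "yes" often signals hallucination
    }


if __name__ == "__main__":
    preds = ["yes", "yes", "no", "yes", "no"]
    golds = ["yes", "no", "no", "yes", "yes"]
    print(score_discrimination(preds, golds))
```

A usage note: beyond raw accuracy, the "yes" ratio is tracked because models that over-affirm object presence tend to score well on recall while hallucinating frequently, which is exactly the failure mode discriminative benchmarks are designed to expose.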