16 Jun 2024 | Xiyang Wu*, Tianrui Guan*, Dianqi Li, Shuaiyi Huang, Xiaoyu Liu, Xijun Wang, Ruiqi Xian, Abhinav Shrivastava, Furong Huang, Jordan Lee Boyd-Graber, Tianyi Zhou, Dinesh Manocha
**Abstract:**
Large vision-language models (LVLMs) often hallucinate, generating incorrect responses that include abnormal or hypothetical objects. While some benchmarks have been developed to investigate these hallucinations, they rely on hand-crafted corner cases, which may not generalize well and can lead to overfitting. To address this, the authors propose AUTOHALLUSION, an automatic benchmark generation approach that creates diverse hallucination examples. AUTOHALLUSION uses three main strategies—abnormal object insertion, paired object insertion, and correlated object removal—to synthesize images that conflict with the language modules' priors. It then generates questions about object existence and spatial relations to induce hallucinations. The method is evaluated on synthetic and real-world datasets, achieving high success rates in inducing hallucinations across top-tier LVLMs such as GPT-4V(ision), Gemini Pro Vision, Claude 3, and LLaVA-1.5.
**Main Contributions:**
- Development of the first automatic benchmark generation approach for hallucination induction.
- Introduction of novel probing methods to extract and investigate contextual biases in language priors.
- Development of two evaluation metrics to detect hallucinations.
- Comprehensive evaluation of state-of-the-art LVLMs, achieving 97.7% and 98.7% success rates in inducing hallucinations on synthetic and real-world datasets, respectively.
**Related Work:**
The paper reviews existing work on vision-language models, benchmarks for hallucination evaluation, and object hallucination issues. It highlights the limitations of current methods, such as the labor-intensive nature of creating hallucination examples and the lack of generalizability of handcrafted benchmarks.
**Problem Formulation:**
The objective is to find objects that the LVLM's language prior strongly associates with the scene context but that are absent from, or contradicted by, the image, and to use them to induce hallucinations. Concretely, the attack seeks image manipulations and probing questions that maximize the distance between the model's generated answer and the ground truth fixed by how the image was constructed.
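Read one way (the notation below is ours, not necessarily the paper's), this amounts to searching over manipulated images and probing questions for the pair whose answer most disagrees with the construction-time ground truth:

```latex
% A minimal sketch of the attack objective, in our own notation:
%   x -- a (possibly manipulated) image from the edit space \mathcal{X}
%   q -- a probing question about object existence or spatial relation
%   f -- the LVLM under attack, producing an answer f(x, q)
%   y -- the ground-truth answer fixed by how x was constructed
%   d -- a distance / disagreement measure between answers
\[
  x^{*},\, q^{*} \;=\; \arg\max_{x \in \mathcal{X},\; q \in \mathcal{Q}}
  \; d\bigl(f(x, q),\, y(x, q)\bigr)
\]
```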
**Methodology:**
The methodology includes scene generation, image manipulation, question construction, and hallucination detection. The authors use diffusion models or image generation models to create context-rich scenes and manipulate objects to induce hallucinations. Questions are constructed to probe object existence and spatial relations, and hallucinations are detected through correctness and consistency of responses.
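To make the four stages concrete, here is a minimal Python sketch of one attack iteration. Every helper passed in (generate_scene, insert_object, remove_object, lvlm_ask, detect) is a hypothetical stand-in, and the example scene and objects are ours; this is not the authors' released code.

```python
# Illustrative sketch of one AUTOHALLUSION attack iteration.
# All callables are hypothetical stand-ins, not a real API.

def autohallusion_attack(lvlm_ask, generate_scene, insert_object, remove_object,
                         detect, strategy="abnormal_insertion"):
    # 1. Scene generation: synthesize a context-rich scene with an image generator.
    scene = generate_scene("a typical office desk scene")

    # 2. Image manipulation: edit the scene so it conflicts with the language prior.
    if strategy == "abnormal_insertion":
        image, probe_obj, truth = insert_object(scene, "violin"), "violin", "present"
    elif strategy == "paired_insertion":
        # Insert one object of a strongly correlated pair, then probe for the other.
        image, probe_obj, truth = insert_object(scene, "keyboard"), "mouse", "absent"
    else:  # "correlated_removal": remove an object the prior expects in this scene.
        image, probe_obj, truth = remove_object(scene, "monitor"), "monitor", "absent"

    # 3. Question construction: probe existence and spatial relations of the target object.
    questions = [
        f"Is there a {probe_obj} in the image?",
        f"Where is the {probe_obj} located relative to the desk?",
    ]

    # 4. Hallucination detection: the attack succeeds if any answer contradicts the
    #    construction-time ground truth, or if the answers are mutually inconsistent.
    answers = [lvlm_ask(image, q) for q in questions]
    return detect(answers, probe_obj, truth)
```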
**Evaluation and Metrics:**
The evaluation metrics include Attack Success Rate (ASR), Manipulation Attack Success Rate (MASR), and Conflict Attack Success Rate (CASR). The results show high ASR and MASR across synthetic and real-world datasets, with GPT-4V-Turbo demonstrating superior robustness to hallucination attacks.
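As a rough illustration of how such a success rate can be tallied (our own simplification based on the correctness/consistency criteria above, not the paper's exact MASR/CASR definitions):

```python
# Minimal sketch of an attack-success-rate tally (our simplification).

def attack_success_rate(results):
    """results: list of dicts with boolean flags 'incorrect' and 'inconsistent'
    for each attacked (image, question) case."""
    successes = sum(1 for r in results if r["incorrect"] or r["inconsistent"])
    return successes / len(results) if results else 0.0

# Example: 3 of 4 attacks yield a wrong or self-contradictory answer -> ASR = 0.75.
cases = [
    {"incorrect": True,  "inconsistent": False},
    {"incorrect": False, "inconsistent": True},
    {"incorrect": True,  "inconsistent": True},
    {"incorrect": False, "inconsistent": False},
]
print(attack_success_rate(cases))  # 0.75
```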
**Ablation Studies:**
The paper analyzes the impact of object size, object prompting, and object-scene alignment on hallucination induction. Larger inserted objects reduce hallucination rates, and different LVLMs show varying levels of robustness to the attacks.
**Conclusion:**
AUTOHALLUSION effectively induces hallucinations in LVLMs, providing insights into common failure patterns and the mechanisms behind them.