Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem

6 Mar 2024 | Yuhong Sun, Zhangyue Yin, Qipeng Guo, Jiawen Wu, Xipeng Qiu, Hui Zhao
This paper introduces a new method for evaluating hallucination in large language models (LLMs) based on unanswerable math word problems (MWPs). The authors propose a dataset called Unanswerable Math Word Problem (UMWP), which contains 5,200 questions, half answerable and half unanswerable. The dataset is constructed by modifying existing MWP datasets to create unanswerable questions. The authors develop an evaluation methodology that combines text similarity and mathematical expression detection to determine whether an LLM considers a question unanswerable. Experiments on 31 LLMs, including GPT-3, InstructGPT, LLaMA, and Claude, show that in-context learning and reinforcement learning with human feedback (RLHF) training significantly enhance a model's ability to avoid hallucination. The study demonstrates that using MWPs is a reliable and effective approach to assess hallucination in LLMs. The authors also show that LLMs can recognize unanswerable questions and may output variable expressions or phrases indicating uncertainty. The experiments further highlight the impact of model size, input form, and RLHF on hallucination mitigation. The study concludes that the proposed method provides a feasible way of assessing hallucination in LLMs. The dataset and code are available at https://github.com/Yuki-Asuuna/UMWP.
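To make the evaluation methodology concrete, below is a minimal sketch of how such a check could work: the model's output is compared against refusal-style template phrases via text similarity, and a regex looks for unresolved variable expressions in the answer. The template phrases, similarity threshold, and regex pattern here are illustrative assumptions, not the authors' released UMWP code; see the repository linked above for the actual implementation.

```python
# Illustrative sketch only -- not the UMWP authors' released evaluation code.
# Templates, threshold, and regex below are assumptions for demonstration.
import re
from difflib import SequenceMatcher

# Hypothetical reference phrases signaling that the model treats the question as unanswerable.
UNANSWERABLE_TEMPLATES = [
    "the question cannot be answered",
    "there is not enough information",
    "the problem is unanswerable",
    "it is impossible to determine",
]

# Detects expressions containing an unresolved variable (e.g. "x + 5"),
# which the paper notes models may output for unanswerable questions.
VARIABLE_EXPR_PATTERN = re.compile(
    r"\b[a-zA-Z]\s*[\+\-\*/=]\s*\d+|\b\d+\s*[\+\-\*/=]\s*[a-zA-Z]\b"
)

def similarity(a: str, b: str) -> float:
    """Character-level similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def judged_unanswerable(model_output: str, threshold: float = 0.6) -> bool:
    """Return True if the model output indicates the question is unanswerable.

    Combines (1) text similarity against refusal templates and
    (2) detection of variable expressions left in the answer.
    """
    # (1) Text-similarity check against the refusal templates.
    for template in UNANSWERABLE_TEMPLATES:
        if similarity(model_output, template) >= threshold:
            return True
    # (2) Mathematical-expression check: an answer containing an unresolved
    # variable suggests the model could not compute a concrete number.
    if VARIABLE_EXPR_PATTERN.search(model_output):
        return True
    return False

if __name__ == "__main__":
    print(judged_unanswerable("There is not enough information to solve this."))  # True
    print(judged_unanswerable("The answer is x + 5 apples."))                     # True
    print(judged_unanswerable("The answer is 12 apples."))                        # False
```

In this sketch, either signal alone marks a response as "unanswerable-aware"; a real pipeline would likely tune the threshold and compare at the sentence level rather than over the whole output.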