When Do LLMs Need Retrieval Augmentation? Mitigating LLMs' Overconfidence Helps Retrieval Augmentation


2024-06-11 | Shiyu Ni, Keping Bi, Jiafeng Guo, Xueqi Cheng
This paper studies overconfidence in Large Language Models (LLMs) and proposes methods to sharpen their perception of their own knowledge boundaries, with a focus on Retrieval Augmentation (RA). The authors measure how well LLMs perceive their factual knowledge boundaries and find that overconfidence is the primary cause of poor perception. They then examine the correlation between an LLM's stated certainty about its internal knowledge and its reliance on external information, observing that LLMs lean more heavily on retrieved documents when they express uncertainty.

Building on this, the authors propose several prompting methods to mitigate overconfidence: "Punish," "Challenge," "Think-Step-by-Step," "Generate," and "Explain." These methods urge LLMs to be more prudent before claiming certainty, improving the calibration of their confidence. Experiments on two open-domain QA benchmarks, Natural Questions (NQ) and HotpotQA, show that the proposed methods enhance LLMs' perception of their knowledge boundaries, enabling adaptive retrieval augmentation that performs better while issuing fewer retrieval calls. The paper closes by discussing its main contributions, limitations, and ethical considerations.
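The adaptive loop described above is simple to make concrete: elicit an answer together with a verbalized confidence label, and call the retriever only when the model says it is uncertain. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation; `ask_llm`, `retrieve_documents`, and the exact prompt wording are hypothetical placeholders, and the prompt only paraphrases the spirit of the paper's prudence prompts (e.g., "Challenge").

```python
# Minimal sketch of confidence-based adaptive retrieval augmentation.
# Hypothetical pieces (not from the paper): ask_llm, retrieve_documents,
# and the prompt text below.

def ask_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call."""
    raise NotImplementedError("plug in an LLM client here")

def retrieve_documents(question: str, k: int = 3) -> list[str]:
    """Placeholder for a retriever (e.g., BM25 or a dense index)."""
    raise NotImplementedError("plug in a retriever here")

# Prompt that elicits an answer plus a verbalized confidence label,
# nudging the model to be prudent before claiming certainty.
CONFIDENCE_PROMPT = (
    "Question: {question}\n"
    "Answer the question, then say whether you are CERTAIN or UNCERTAIN.\n"
    "Confident answers to questions like this are often wrong, so think\n"
    "carefully before claiming certainty.\n"
    "Format:\nAnswer: <answer>\nConfidence: <CERTAIN|UNCERTAIN>"
)

def parse_answer_and_confidence(raw: str) -> tuple[str, str]:
    """Loosely parse the formatted response requested above."""
    answer, confidence = "", "UNCERTAIN"
    for line in raw.splitlines():
        if line.startswith("Answer:"):
            answer = line.removeprefix("Answer:").strip()
        elif line.startswith("Confidence:"):
            confidence = line.removeprefix("Confidence:").strip().upper()
    return answer, confidence

def answer_adaptively(question: str) -> str:
    """Retrieve only when the model reports it is uncertain."""
    raw = ask_llm(CONFIDENCE_PROMPT.format(question=question))
    answer, confidence = parse_answer_and_confidence(raw)
    if confidence == "CERTAIN":
        return answer  # trust internal knowledge; skip the retrieval call
    # Model is uncertain: augment with retrieved evidence and re-ask.
    context = "\n".join(retrieve_documents(question))
    return ask_llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```

The design point this sketch captures is the paper's central finding: the better the model's confidence is calibrated (i.e., the less overconfident it is), the more reliably the `CERTAIN` branch can skip retrieval without hurting accuracy, which is what yields comparable or better QA performance with fewer retrieval calls.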