24 Jan 2024 | Mark Steyvers, Heliodoro Tejeda, Aakriti Kumar, Catarina Belem, Sheer Karny, Xinyue Hu, Lukas Mayer, Padhraic Smyth
The paper explores the calibration gap between the internal confidence of large language models (LLMs) and human users' perception of this confidence. Through experiments involving multiple-choice questions, the study examines how well users can discern the reliability of LLM outputs. The research focuses on two key areas: assessing users' perception of true LLM confidence and investigating the impact of tailored explanations on this perception. The findings highlight that default explanations from LLMs often lead to overestimation of both the model's confidence and accuracy. By modifying explanations to better reflect the LLM's internal confidence, the study observes a significant shift in user perception, aligning it more closely with the model's actual confidence levels. This adjustment in explanatory approach demonstrates potential for enhancing user trust and accuracy in assessing LLM outputs. The research underscores the importance of transparent communication of confidence levels in LLMs, particularly in high-stakes applications where understanding the reliability of AI-generated information is essential.
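To make the notion of a "calibration gap" concrete, below is a minimal Python sketch. It assumes the LLM's internal confidence on a multiple-choice question is obtained by softmax-normalizing the log-probabilities it assigns to each answer option (a common approach, not necessarily the paper's exact procedure), and then measures the gap as the signed mean difference between the confidence humans attribute to the model and the model's own confidence. All function names and numbers are illustrative, not taken from the study.

```python
import math

def choice_confidence(option_logprobs: dict[str, float]) -> dict[str, float]:
    """Softmax-normalize per-option log-probabilities into a confidence
    distribution over the answer choices."""
    max_lp = max(option_logprobs.values())
    exps = {opt: math.exp(lp - max_lp) for opt, lp in option_logprobs.items()}
    total = sum(exps.values())
    return {opt: v / total for opt, v in exps.items()}

def calibration_gap(model_conf: list[float], human_conf: list[float]) -> float:
    """Signed mean difference between human-perceived and model confidence;
    a positive value means users overestimate the model's confidence."""
    assert len(model_conf) == len(human_conf)
    return sum(h - m for m, h in zip(model_conf, human_conf)) / len(model_conf)

# Hypothetical log-probabilities the LLM assigns to answer tokens A-D
logprobs = {"A": -0.4, "B": -1.9, "C": -2.5, "D": -3.1}
conf = choice_confidence(logprobs)
model_confidence = max(conf.values())  # confidence in the model's top choice

# Hypothetical per-question values: model confidence vs. human ratings of it
model = [0.62, 0.91, 0.55]
human = [0.85, 0.95, 0.80]
print(calibration_gap(model, human))  # positive gap: humans overestimate
```

In this illustrative setup, shrinking the gap corresponds to what the paper reports: rewording explanations so their expressed certainty tracks the model's internal confidence brings human estimates closer to the model's actual confidence.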