The paper "Can Large Language Models Faithfully Express Their Intrinsic Uncertainty in Words?" by Gal Yona et al. explores whether large language models (LLMs) can express their intrinsic uncertainty in natural language. The authors propose the notion of *faithful response uncertainty*, which measures the alignment between the model's intrinsic confidence and the decisiveness with which it conveys its assertions. They formalize this metric and evaluate leading LLMs, including Gemini and GPT-3.5/4, on knowledge-intensive question-answering tasks. The results show that modern LLMs struggle to faithfully convey their uncertainty, often answering decisively even when their intrinsic uncertainty is significant. Prompting models to express uncertainty can induce hedging expressions, but these are not well aligned with the model's intrinsic uncertainty. The study emphasizes the need for better alignment techniques to improve the trustworthiness of LLMs.
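
To make the idea concrete, here is a minimal, illustrative sketch (not the paper's exact formalization) of how one might compare intrinsic confidence with verbalized decisiveness. It assumes confidence is approximated by agreement among resampled answers and that a decisiveness score for the phrasing (e.g., 1.0 for a flat assertion, lower for hedged wording) is supplied externally; both choices are assumptions made for illustration.

```python
from collections import Counter

def intrinsic_confidence(sampled_answers: list[str], claimed_answer: str) -> float:
    """Estimate intrinsic confidence in `claimed_answer` as the fraction of
    independently resampled answers that agree with it (a common
    consistency-based confidence proxy; an assumption, not the paper's method)."""
    if not sampled_answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(a.strip().lower() for a in sampled_answers)
    return counts[claimed_answer.strip().lower()] / len(sampled_answers)

def faithfulness(decisiveness: float, confidence: float) -> float:
    """Score how well the stated decisiveness matches intrinsic confidence:
    1.0 means the phrasing perfectly reflects the model's confidence,
    smaller values indicate over- or under-hedging."""
    return 1.0 - abs(decisiveness - confidence)

# Example: the model answers flatly ("The capital is X.") -> decisiveness ~= 1.0,
# but only 6 of 10 resampled answers agree -> confidence 0.6, faithfulness 0.6.
samples = ["X"] * 6 + ["Y"] * 4
conf = intrinsic_confidence(samples, "X")
print(f"confidence={conf:.2f}, faithfulness={faithfulness(1.0, conf):.2f}")
```

Under this toy scoring, a decisive answer backed by only 60% self-consistency scores poorly, while a hedged answer ("I think it might be X") with a correspondingly lower decisiveness would score as more faithful.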