Can Large Language Models Faithfully Express Their Intrinsic Uncertainty in Words?
Large language models (LLMs) often fail to express their intrinsic uncertainty in natural language, which can lead users to rely on false information. This paper investigates whether LLMs can faithfully convey that uncertainty in the text of their responses. The authors propose a metric called "faithful response uncertainty," which measures the gap between the model's intrinsic confidence in its assertions and the decisiveness with which those assertions are conveyed: a response is faithful when its wording reflects the model's true level of uncertainty.
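As a rough formalization consistent with this description (the paper's exact definition may differ), the gap for a single response can be written as the average absolute difference between decisiveness and intrinsic confidence over the response's assertions:

```latex
% Hedged sketch: one plausible formalization of "faithful response uncertainty",
% not necessarily the paper's exact definition.
% A response r contains assertions a_1, ..., a_n; dec(a_i) is how decisively
% a_i is stated and conf(a_i) is the model's intrinsic confidence in a_i,
% both scaled to [0, 1].
\mathrm{Gap}(r) = \frac{1}{n} \sum_{i=1}^{n} \left| \mathrm{dec}(a_i) - \mathrm{conf}(a_i) \right|
```

Under this reading, a gap of zero means the response sounds exactly as certain as the model actually is, while larger gaps indicate overconfident (or overly hedged) phrasing.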
The study evaluates several LLMs, including Gemini Ultra, GPT-3.5, and GPT-4, on two knowledge-intensive question answering tasks: PopQA and Natural Questions. The results show that LLMs generally respond decisively even when they are uncertain, revealing a mismatch between their intrinsic confidence and the decisiveness of their responses. When prompted to express uncertainty, the models sometimes generate hedged responses, but the hedges often do not match their actual level of confidence.
The authors propose a method to measure decisiveness and confidence by prompting a "judge" LLM to score model responses, and they show that the judge's scores correlate well with human judgments. Even so, the evaluated LLMs struggle to express uncertainty in natural language, which limits their trustworthiness.
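The sketch below illustrates such judge-based scoring in Python. It is a hedged illustration, not the paper's implementation: query_llm, DECISIVENESS_PROMPT, judge_decisiveness, and faithfulness_gap are hypothetical names, and the actual prompts, rating scales, and confidence estimation procedure in the paper may differ.

```python
# Hedged sketch of judge-based decisiveness scoring (not the paper's code).
# `query_llm` is a hypothetical helper; any LLM client could stand in for it.

def query_llm(prompt: str) -> str:
    """Placeholder for a call to a judge LLM; returns the model's text reply."""
    raise NotImplementedError("wire up an LLM client here")

DECISIVENESS_PROMPT = (
    "You will be shown a question and a model's answer.\n"
    "Rate how decisively the answer asserts its claim, from 0.0 (fully hedged)\n"
    "to 1.0 (stated as certain fact). Reply with a single number.\n\n"
    "Question: {question}\nAnswer: {answer}\nDecisiveness:"
)

def judge_decisiveness(question: str, answer: str) -> float:
    """Ask the judge LLM how decisively the answer is phrased (0.0 to 1.0)."""
    reply = query_llm(DECISIVENESS_PROMPT.format(question=question, answer=answer))
    return float(reply.strip())

def faithfulness_gap(decisiveness: float, confidence: float) -> float:
    """Absolute gap between how sure the response sounds and how sure the
    model actually is. Intrinsic confidence is taken as a given input here;
    it could be estimated separately, e.g. with a similar judge prompt or by
    sampling answers and measuring agreement (an assumption, not the paper's
    stated procedure)."""
    return abs(decisiveness - confidence)
```

A small gap means the phrasing matches the model's confidence; a large gap flags a response that sounds far more (or less) certain than the model really is.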
The paper concludes that modern LLMs are poor at faithfully conveying their uncertainty, and that better alignment techniques are needed to improve their reliability. The authors emphasize the importance of expressing intrinsic uncertainty in natural language to ensure that users are not overly reliant on potentially false information.