Large language model validity via enhanced conformal prediction methods


14 Jun 2024 | John J. Cherian, Isaac Gibbs, Emmanuel J. Candès
The paper addresses the challenge of ensuring the validity of outputs from large language models (LLMs) by developing new conformal inference methods. Prior work on conformal language modeling identifies a subset of the generated text that is guaranteed to be correct with high probability, but these methods suffer from two main issues: they do not provide conditionally valid guarantees, and they can remove valuable claims because the underlying claim-scoring functions are imperfect. To address these challenges, the authors propose two new methods:

1. **Conditional Boosting**: This method automates the discovery of better claim-scoring functions by differentiating through the conditional conformal algorithm, so that the scoring function can be optimized to maximize the number of claims retained in the filtered output.
2. **Level-Adaptive Conformal Prediction**: This method generalizes the conditional conformal procedure to adaptively issue weaker guarantees when necessary to preserve the utility of the output. The reported level (the claimed probability of factuality) is allowed to depend on characteristics of the queried prompt; in the paper's experiments, the level is adapted so that at least 70% of the original claims are retained.

The authors demonstrate the effectiveness of these methods on both synthetic and real-world datasets, showing improved claim retention and non-trivial guarantees of response factuality compared to existing approaches. The paper also includes theoretical analysis and experimental results supporting the proposed methods.
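To make the filtering idea concrete, below is a minimal, hypothetical Python sketch of split-conformal claim filtering together with a much-simplified, marginal stand-in for level adaptation. The function names, scoring inputs, and retention rule are illustrative assumptions, not the authors' implementation; in particular, the paper's methods condition on prompt features through a conditional conformal procedure (and boost the score function by differentiating through it), which this sketch omits.

```python
# Hypothetical sketch of split-conformal claim filtering and a marginal
# level-adaptive variant.  Names and the retention rule are illustrative
# assumptions; the paper's actual methods are conditional, not marginal.
import numpy as np


def calibrate_threshold(cal_scores, cal_correct, alpha):
    """Split-conformal calibration for claim filtering.

    cal_scores  : list of 1-D arrays, one per calibration response, holding a
                  confidence score for each claim (higher = more trusted).
    cal_correct : list of boolean arrays of matching shape, True where the
                  claim was judged factual.
    alpha       : target error level; retained claims of a new response should
                  all be correct with probability >= 1 - alpha.
    """
    # For each response, the smallest threshold that would remove every
    # incorrect claim, i.e. the largest score among its false claims.
    nonconformity = np.array([
        scores[~correct].max() if (~correct).any() else -np.inf
        for scores, correct in zip(cal_scores, cal_correct)
    ])
    n = len(nonconformity)
    k = int(np.ceil((n + 1) * (1 - alpha)))   # finite-sample correction
    if k > n:
        # Not enough calibration data for this alpha: keep nothing.
        return np.inf, nonconformity
    return np.sort(nonconformity)[k - 1], nonconformity


def filter_claims(scores, threshold):
    """Keep only claims whose score exceeds the calibrated threshold."""
    return scores > threshold


def level_adaptive_filter(scores, cal_nonconformity, min_retention=0.7):
    """Marginal stand-in for level-adaptive filtering: relax the threshold
    until at least `min_retention` of the claims survive, then report the
    factuality level implied by that threshold on the calibration data."""
    m = len(scores)
    n_droppable = int(np.floor(m * (1 - min_retention)))
    # Threshold low enough that at least min_retention of the claims pass it.
    threshold = np.sort(scores)[n_droppable]
    kept = scores >= threshold
    # Empirical fraction of calibration responses whose false claims would all
    # have been removed at this threshold -- the "adapted" level reported back.
    adapted_level = float(np.mean(cal_nonconformity < threshold))
    return kept, adapted_level
```

In this sketch, `calibrate_threshold` and `filter_claims` play the role of the fixed-level baseline, while `level_adaptive_filter` mirrors, in a marginal way, the trade-off the paper formalizes: keep more claims by reporting a weaker, prompt-specific level rather than silently discarding useful output.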