27 Jun 2024 | Shayan Kiyani*, George Pappas*, Hamed Hassani*
This paper introduces Conformal Prediction with Length-Optimization (CPL), a novel framework that constructs prediction sets with near-optimal length while ensuring conditional validity under various covariate shifts. The framework addresses two key challenges in conformal prediction: conditional validity and length efficiency. Conditional validity requires that prediction sets contain the true label with high probability across different subpopulations; length efficiency requires that the sets be as small as possible without sacrificing that validity.
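The tension between these two goals can be seen in a minimal sketch that uses standard split conformal prediction, which is a baseline and not CPL itself. Everything in it is an assumption made for illustration: the synthetic data from `simulate`, the trivial "model" that predicts `x`, the absolute-residual score, the two groups, and the level `ALPHA = 0.1`.

```python
import math
import random

ALPHA = 0.1  # target miscoverage level (assumed for this sketch)

def simulate(n, rng):
    """Hypothetical toy data: y = x + noise, noisier in group 1 than in group 0."""
    points = []
    for _ in range(n):
        group = rng.choice([0, 1])
        x = rng.uniform(0.0, 1.0)
        sigma = 0.1 if group == 0 else 1.0
        points.append((group, x, x + rng.gauss(0.0, sigma)))
    return points

def score(x, y):
    """Conformity score |y - prediction|; the 'model' here simply predicts x."""
    return abs(y - x)

def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    return math.inf if k > n else sorted(scores)[k - 1]

def coverage(points, threshold_for):
    """Fraction of points whose score is within the threshold for their group."""
    hits = sum(1 for g, x, y in points if score(x, y) <= threshold_for(g))
    return hits / len(points)

rng = random.Random(0)
calib, test = simulate(2000, rng), simulate(2000, rng)

# Baseline 1: a single threshold for all points (marginal calibration).
q_all = conformal_quantile([score(x, y) for _, x, y in calib], ALPHA)
marginal_cov = coverage(test, lambda g: q_all)
group_cov_marginal = {
    g: coverage([p for p in test if p[0] == g], lambda _g: q_all) for g in (0, 1)
}
mean_len_marginal = 2 * q_all  # each interval is [x - q_all, x + q_all]

# Baseline 2: a separate threshold per group (group-wise calibration).
q_group = {
    g: conformal_quantile([score(x, y) for gg, x, y in calib if gg == g], ALPHA)
    for g in (0, 1)
}
group_cov_groupwise = {
    g: coverage([p for p in test if p[0] == g], lambda gg: q_group[gg]) for g in (0, 1)
}
mean_len_groupwise = sum(2 * q_group[g] for g, _, _ in test) / len(test)

print("marginal coverage:", marginal_cov)
print("per-group coverage, single threshold:", group_cov_marginal)
print("per-group coverage, group thresholds:", group_cov_groupwise)
print("mean length:", mean_len_marginal, "vs", mean_len_groupwise)
```

In this toy setup the single threshold should over-cover the low-noise group and under-cover the high-noise group even though marginal coverage is close to the target, while the group-wise thresholds should restore per-group coverage with a shorter average interval. This only illustrates why validity and length need to be considered together; it says nothing about how CPL itself performs.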
CPL operates at the third stage of the conformal prediction pipeline, focusing on designing prediction sets given a predictive model and a conformity score. It leverages the structure of the covariate X to optimize prediction set length, which is crucial for informative and non-trivial prediction sets. The framework is based on a minimax formulation that ensures conditional validity and optimizes length. In the infinite sample regime, strong duality results show that CPL achieves optimal length and validity. In the finite sample regime, CPL constructs conditionally valid prediction sets.
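One way such a minimax formulation could be written is sketched below. The notation is assumed for illustration and is not taken from this summary: $S$ is the conformity score, $h \in \mathcal{H}$ is a covariate-dependent threshold, $\mathcal{F}$ is a linear class of weight functions encoding the covariate shifts, and $1-\alpha$ is the target coverage. The paper's exact objective may differ in form.

```latex
% Prediction sets induced by a threshold function h (assumed parameterization)
C_h(x) = \{\, y : S(x, y) \le h(x) \,\}

% Primal problem: shortest sets subject to conditional validity over the class F
\min_{h \in \mathcal{H}} \; \mathbb{E}\big[\operatorname{len}(C_h(X))\big]
\quad \text{s.t.} \quad
\mathbb{E}\Big[ f(X)\,\big( \mathbf{1}\{Y \in C_h(X)\} - (1-\alpha) \big) \Big] = 0
\quad \forall f \in \mathcal{F}

% Lagrangian, with f acting as the multiplier
L(h, f) = \mathbb{E}\big[\operatorname{len}(C_h(X))\big]
        - \mathbb{E}\Big[ f(X)\,\big( \mathbf{1}\{Y \in C_h(X)\} - (1-\alpha) \big) \Big]

% Strong duality: the order of min and max can be exchanged
\min_{h \in \mathcal{H}} \max_{f \in \mathcal{F}} L(h, f)
= \max_{f \in \mathcal{F}} \min_{h \in \mathcal{H}} L(h, f)
```

Under this reading, the inner maximization over $f$ is finite only when the coverage constraint holds for every $f \in \mathcal{F}$, so the left-hand side recovers the constrained problem, and the strong duality claim is what licenses solving the right-hand side instead.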
The paper provides extensive empirical evaluations across diverse real-world and synthetic datasets in classification, regression, and text-related settings. CPL outperforms state-of-the-art methods in terms of prediction set size, achieving superior length efficiency. The framework is shown to maintain conditional validity across different levels of covariate shifts, including marginal and group-conditional coverage. Theoretical guarantees are provided for the finite sample setting, ensuring conditional coverage validity up to finite-sample error terms. CPL is demonstrated to significantly improve length efficiency in real-world applications compared to existing methods.