This tutorial introduces conformal prediction, a method that uses past experience to attach precise levels of confidence to new predictions. Conformal prediction can be applied on top of any method for making point predictions, including nearest-neighbor methods, support-vector machines, ridge regression, and others. It is designed for an online setting in which labels are predicted successively, each one being revealed before the next is predicted. The key feature of conformal prediction is that if the examples are sampled independently from the same distribution, the successive predictions will be correct 1 − ε of the time, even though they are based on an accumulating dataset rather than on independent datasets.
Conformal prediction uses a nonconformity measure to determine how unusual an example is relative to previous examples. Given a nonconformity measure, the conformal algorithm produces a prediction region Γ^ε for every probability of error ε. The region Γ^ε is a (1 - ε)-prediction region that contains the true label with probability at least 1 - ε. These regions are nested, with larger regions for smaller ε values.
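The algorithm just described can be sketched in a few lines. The sketch below is illustrative, not the tutorial's own code: it uses one simple nonconformity measure (distance to the nearest example with the same label) for one-dimensional objects, and the function names and toy data are invented for the example. For each candidate label, the new example is provisionally added to the bag, every example's nonconformity score is recomputed, and the label is kept in Γ^ε when the fraction of scores at least as large as the new example's score exceeds ε.

```python
def nn_nonconformity(examples, i):
    """Nonconformity score of examples[i]: distance to the nearest
    *other* example carrying the same label (inf if there is none)."""
    x, y = examples[i]
    same = [abs(x - xj) for j, (xj, yj) in enumerate(examples)
            if j != i and yj == y]
    return min(same) if same else float("inf")

def conformal_region(old_examples, new_x, labels, eps):
    """Return the (1 - eps)-prediction region for the label of new_x."""
    region = []
    for y in labels:
        # Provisionally treat (new_x, y) as the (n+1)-th example.
        examples = old_examples + [(new_x, y)]
        scores = [nn_nonconformity(examples, i) for i in range(len(examples))]
        # p-value: fraction of scores at least as large as the new one.
        p = sum(s >= scores[-1] for s in scores) / len(scores)
        if p > eps:
            region.append(y)
    return region

# Toy data: two well-separated classes on the real line.
data = [(1.0, "a"), (1.2, "a"), (5.0, "b"), (5.3, "b")]
print(conformal_region(data, 1.1, ["a", "b"], eps=0.2))  # → ['a']
```

With a point near the "a" cluster and ε = 0.2, only the label "a" conforms well enough to stay in the region; a larger ε would eventually exclude it too, illustrating how the regions shrink as ε grows.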
The validity of conformal prediction rests on the assumption of exchangeability, which is weaker than the assumption that examples are drawn independently from the same distribution: it requires only that the joint distribution of the examples be unchanged by permuting their order (for instance, draws without replacement from an urn are exchangeable but not independent). Under exchangeability, conformal prediction is valid in the sense that predictions will be correct 1 − ε of the time. This is demonstrated in the case of normally distributed examples, where Fisher's prediction interval remains valid under exchangeability.
The tutorial also discusses the efficiency of conformal prediction, which depends on the probability distribution and the nonconformity measure used. In classification, a 95% prediction region should be small enough to contain only the predicted label. In regression, it should be a narrow interval around the predicted value.
The tutorial provides a detailed example of Fisher's prediction interval for normally distributed examples, showing how it can be used to predict a new value from previous observations. It also distinguishes confidence from full-fledged conditional probability: a 1 − ε confidence statement guarantees correctness with frequency 1 − ε over the long run, not that each individual prediction is correct with conditional probability 1 − ε.
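For reference, Fisher's interval for predicting a new observation from previous observations $z_1, \ldots, z_n$ assumed normally distributed has the standard form (here $\bar{z}_n$ denotes the sample mean, $s_n$ the sample standard deviation, and $t_{n-1}^{(\varepsilon/2)}$ the upper $\varepsilon/2$ quantile of the $t$ distribution with $n-1$ degrees of freedom):

```latex
\bar{z}_n \;\pm\; t_{n-1}^{(\varepsilon/2)} \, s_n \sqrt{1 + \frac{1}{n}},
\qquad
\bar{z}_n = \frac{1}{n}\sum_{i=1}^{n} z_i,
\qquad
s_n^2 = \frac{1}{n-1}\sum_{i=1}^{n} (z_i - \bar{z}_n)^2 .
```

The factor $\sqrt{1 + 1/n}$ widens the interval relative to a confidence interval for the mean, because it must account for the variability of the new observation as well as the uncertainty in $\bar{z}_n$.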
The tutorial concludes with a discussion of the law of large numbers for exchangeable sequences, which ensures that a high proportion of predictions will be correct. It also discusses the implications of the failure of the exchangeability assumption and the importance of choosing an appropriate nonconformity measure for efficient prediction.