Understanding the Relationship Between Precision-Recall and ROC Curves

The paper explores the relationship between Precision-Recall (PR) curves and Receiver Operator Characteristic (ROC) curves, particularly for highly skewed datasets. The authors show that the two spaces are closely connected: one curve dominates another in ROC space if and only if it also dominates in PR space. They introduce the "achievable PR curve," the PR-space analogue of the convex hull in ROC space, and show that it can be computed efficiently. The paper also highlights where the two spaces differ: linear interpolation between points is incorrect in PR space, and an algorithm that optimizes the area under the ROC curve is not guaranteed to optimize the area under the PR curve. The authors prove these relationships and discuss their implications for algorithm design and evaluation.
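The point about linear interpolation can be made concrete with a small sketch. Precision has false positives in its denominator, so moving between two operating points changes precision nonlinearly even when true and false positives change at a constant rate. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the function names `pr_point` and `interpolate_pr` are hypothetical, the counts in the usage note are illustrative, and false positives are assumed to accrue at a constant "local skew" rate per additional true positive between the two points.

```python
def pr_point(tp, fp, total_pos):
    """Return (recall, precision) for given true/false positive counts."""
    recall = tp / total_pos
    # Convention assumed here: precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
    return recall, precision


def interpolate_pr(tp_a, fp_a, tp_b, fp_b, total_pos):
    """Interpolate between two PR points by stepping through count space.

    Assumes tp_b > tp_a. For each extra true positive, false positives
    grow by the local skew (fp_b - fp_a) / (tp_b - tp_a).
    """
    if tp_b <= tp_a:
        raise ValueError("tp_b must be greater than tp_a")
    local_skew = (fp_b - fp_a) / (tp_b - tp_a)
    points = []
    for x in range(tp_b - tp_a + 1):
        tp = tp_a + x
        fp = fp_a + local_skew * x
        points.append(pr_point(tp, fp, total_pos))
    return points


def linear_precision(recall, point_a, point_b):
    """Naive straight-line precision between two (recall, precision) points."""
    (r_a, p_a), (r_b, p_b) = point_a, point_b
    return p_a + (p_b - p_a) * (recall - r_a) / (r_b - r_a)
```

As a usage example, take 20 positive examples with point A at TP = 5, FP = 5 and point B at TP = 10, FP = 30. Calling `interpolate_pr(5, 5, 10, 30, 20)` steps from (recall 0.25, precision 0.5) to (recall 0.5, precision 0.25). At recall 0.3 the count-based precision is 6 / 16 = 0.375, while `linear_precision` gives 0.45, so a straight line between the two points overstates the achievable precision.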

The Relationship Between Precision-Recall and ROC Curves

2006 | Jesse Davis, Mark Goadrich