Measuring classifier performance: a coherent alternative to the area under the ROC curve

2009 | David J. Hand
The paper discusses the limitations of the Area Under the ROC Curve (AUC) as a measure of classifier performance, in particular its incoherence with respect to misclassification costs. The AUC is widely used because it is objective and has intuitive interpretations, but it can give misleading results when ROC curves cross. More fundamentally, the AUC implicitly uses a different distribution over misclassification costs for each classifier, which is equivalent to evaluating different classification rules with different metrics. This is problematic because the relative severities of the different types of misclassification should be determined by the problem itself, not by the chosen classifier.

The paper proposes an alternative measure, the \(H\) measure, which addresses this issue by fixing the weight function over costs to a symmetric beta distribution, making the measure coherent and comparable across classifiers and so providing a more reliable way to evaluate their performance. The paper also discusses how to estimate the \(H\) measure and presents empirical results demonstrating its effectiveness.
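As a concrete illustration of the idea, the sketch below estimates an H-style measure empirically: for each normalised cost \(c\) it finds the threshold minimising expected loss, averages that minimum loss under a Beta(2,2) weight over \(c\), and normalises against the loss of the best trivial (constant) classifier. This is a minimal numerical approximation written for this summary, not the paper's exact estimator; the function name, the score/label conventions, and the grid-based integration are all assumptions.

```python
import numpy as np

def h_measure(scores, labels, grid=1001):
    """Approximate an H-style measure with a Beta(2,2) cost weight.

    Assumed conventions (not taken from the paper's notation):
    labels are 1 for the positive class, 0 for the negative class;
    we predict positive when score > threshold; c in (0, 1) is the
    normalised cost of misclassifying a class-0 (negative) case.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n1 = labels.sum()
    n0 = labels.size - n1
    pi0, pi1 = n0 / labels.size, n1 / labels.size

    # Candidate thresholds: every observed score plus the two extremes
    # (the extremes correspond to the trivial always-one-class rules).
    thr = np.concatenate(([-np.inf], np.unique(scores), [np.inf]))
    s0 = np.sort(scores[labels == 0])
    s1 = np.sort(scores[labels == 1])
    F0 = np.searchsorted(s0, thr, side="right") / n0  # P(score <= t | class 0)
    F1 = np.searchsorted(s1, thr, side="right") / n1  # P(score <= t | class 1)

    cs = np.linspace(0.0, 1.0, grid)
    w = 6.0 * cs * (1.0 - cs)  # Beta(2,2) density: symmetric, vanishes at 0 and 1

    # Expected loss at cost c and threshold t:
    #   c * pi0 * (1 - F0(t))      false positives (class 0 scored above t)
    # + (1 - c) * pi1 * F1(t)      false negatives (class 1 at or below t)
    loss = (cs[:, None] * pi0 * (1.0 - F0[None, :])
            + (1.0 - cs[:, None]) * pi1 * F1[None, :])
    min_loss = loss.min(axis=1)  # best achievable loss at each cost c

    # Reference loss: the better of the two trivial constant classifiers.
    trivial = np.minimum(cs * pi0, (1.0 - cs) * pi1)

    # Riemann-sum integration over the cost grid.
    dc = cs[1] - cs[0]
    L = np.sum(min_loss * w) * dc
    L_max = np.sum(trivial * w) * dc
    return 1.0 - L / L_max
```

Because the trivial classifiers are included among the candidate thresholds, the result always lies in \([0, 1]\): a perfectly separating score list gives 1, and a score list no better than a constant rule gives 0. Crucially, the Beta(2,2) weight is the same for every classifier being compared, which is the coherence property the summary describes.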