Obtaining Well Calibrated Probabilities Using Bayesian Binning

2015 | Mahdi Pakdaman Naeini, Gregory F. Cooper, and Milos Hauskrecht
The paper introduces Bayesian Binning into Quantiles (BBQ), a new non-parametric calibration method for binary classification. BBQ post-processes the output of a binary classification algorithm, so it is compatible with a wide range of classification models and addresses limitations of existing calibration methods. The method is computationally efficient and empirically accurate, as demonstrated through experiments on both real and simulated datasets.

The authors compare BBQ with other calibration methods, including histogram binning, Platt scaling, and isotonic regression. BBQ outperforms these methods in calibration, as measured by the Expected Calibration Error (ECE) and Maximum Calibration Error (MCE), while maintaining or improving discrimination performance, as measured by the Area Under the ROC Curve (AUC) and Accuracy (ACC).

BBQ combines multiple binning models using a Bayesian score derived from the BDeu score used for learning Bayesian network structures. Considering multiple different binnings and their combinations, rather than committing to a single binning, yields more robust calibrated predictions. The experimental results show that BBQ performs competitively with the other methods in terms of discrimination and often outperforms them in terms of calibration, and the authors recommend using BBQ to post-process binary predictions and improve model calibration.
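To make the ideas above concrete, the sketch below shows (1) how ECE and MCE are typically computed from predicted probabilities, and (2) the core BBQ idea of averaging several quantile-binning models weighted by a Bayesian (Beta-Binomial, BDeu-like) marginal likelihood. This is a minimal illustration under simplifying assumptions, not the authors' exact implementation: the candidate bin counts, the Beta(1, 1) posterior-rate prior, and the even split of the equivalent sample size across bins are assumptions made here for brevity.

```python
import math
import numpy as np


def ece_mce(probs, labels, n_bins=10):
    """Expected and Maximum Calibration Error over equal-width probability bins."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    idx = np.minimum((probs * n_bins).astype(int), n_bins - 1)
    gaps, weights = [], []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            gaps.append(abs(probs[mask].mean() - labels[mask].mean()))
            weights.append(mask.mean())
    gaps = np.asarray(gaps)
    return float(np.dot(weights, gaps)), float(gaps.max())


def log_marginal_likelihood(bin_idx, labels, n_bins, prior_ess=2.0):
    """Log P(D | binning): product of Beta-Binomial marginals over bins.

    prior_ess is a BDeu-like equivalent sample size split evenly across bins
    (a simplification of the paper's prior, which also uses bin midpoints).
    """
    a0 = b0 = prior_ess / (2.0 * n_bins)
    log_ml = 0.0
    for b in range(n_bins):
        m = labels[bin_idx == b].sum()   # positives in bin b
        n = (bin_idx == b).sum()         # total examples in bin b
        log_ml += (math.lgamma(m + a0) + math.lgamma(n - m + b0)
                   - math.lgamma(n + a0 + b0)
                   - math.lgamma(a0) - math.lgamma(b0) + math.lgamma(a0 + b0))
    return log_ml


class BBQCalibrator:
    """Bayesian-averaged quantile binning: a sketch of the BBQ idea."""

    def __init__(self, candidate_bins=(5, 10, 15, 20)):
        self.candidate_bins = candidate_bins  # assumed model space, for illustration

    def fit(self, scores, labels):
        scores = np.asarray(scores, dtype=float)
        labels = np.asarray(labels, dtype=int)
        self.models_, log_scores = [], []
        for n_bins in self.candidate_bins:
            # Interior quantile cut points; np.unique guards against ties in the scores.
            cuts = np.unique(np.quantile(scores, np.linspace(0, 1, n_bins + 1)[1:-1]))
            n_eff = len(cuts) + 1
            idx = np.digitize(scores, cuts)
            # Posterior-mean positive rate per bin under a Beta(1, 1) prior (assumption).
            rates = np.array([(labels[idx == b].sum() + 1.0) /
                              ((idx == b).sum() + 2.0) for b in range(n_eff)])
            self.models_.append((cuts, rates))
            log_scores.append(log_marginal_likelihood(idx, labels, n_eff))
        log_scores = np.asarray(log_scores)
        w = np.exp(log_scores - log_scores.max())  # numerically stable normalization
        self.weights_ = w / w.sum()
        return self

    def predict(self, scores):
        scores = np.asarray(scores, dtype=float)
        calibrated = np.zeros_like(scores)
        for w, (cuts, rates) in zip(self.weights_, self.models_):
            calibrated += w * rates[np.digitize(scores, cuts)]
        return calibrated
```

In use, one would fit the calibrator on held-out classifier scores and labels, call predict on new scores, and compare ece_mce before and after calibration; the score-weighted average over binnings is what distinguishes this approach from single-binning histogram calibration.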