On Calibration of Modern Neural Networks

3 Aug 2017 | Chuan Guo, Geoff Pleiss, Yu Sun, Kilian Q. Weinberger
Modern neural networks are often poorly calibrated: their predicted confidence scores do not match the true likelihood that their predictions are correct. Surprisingly, this miscalibration has worsened even as classification accuracy has improved dramatically over the past decade. The paper shows that factors such as network depth, width, weight decay, and Batch Normalization all influence calibration, and that increased model capacity combined with reduced regularization is a key contributor to the problem.

The authors evaluate several post-processing calibration methods, including histogram binning, isotonic regression, and Platt scaling. Temperature scaling, a single-parameter variant of Platt scaling, emerges as surprisingly effective and easy to implement, calibrating predictions well across a wide range of architectures and datasets. Good calibration matters for reliable decision-making in applications such as autonomous driving and healthcare, where a model's confidence must be trustworthy. The study offers insight into how neural networks learn and practical guidance for improving calibration in real-world settings.
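The paper quantifies miscalibration with the Expected Calibration Error (ECE): predictions are partitioned into confidence bins, and the gaps between each bin's accuracy and its average confidence are averaged, weighted by bin size. Below is a minimal NumPy sketch of this computation; the function name is illustrative, and 15 bins matches the paper's experimental default.

```python
import numpy as np

def expected_calibration_error(confidences, predictions, labels, n_bins=15):
    """ECE: bin samples by confidence, then average the |accuracy - confidence|
    gap per bin, weighted by the fraction of samples falling in that bin."""
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = (predictions[mask] == labels[mask]).mean()   # bin accuracy
            conf = confidences[mask].mean()                    # bin confidence
            ece += mask.mean() * abs(acc - conf)               # weighted gap
    return ece
```

A perfectly calibrated model has ECE of zero: in every bin, average confidence equals empirical accuracy.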
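Temperature scaling itself is a one-parameter method: all logits are divided by a single scalar T, fit on a held-out validation set by minimizing negative log-likelihood. Because dividing by T cannot change which logit is largest, accuracy is untouched; only confidence is adjusted. The sketch below illustrates the idea with NumPy and SciPy (the optimizer choice and search bounds are assumptions for illustration; the paper's own implementation is not specified here).

```python
import numpy as np
from scipy.optimize import minimize_scalar

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; T = 1 recovers the standard softmax."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(T, logits, labels):
    """Negative log-likelihood of the true labels at temperature T."""
    probs = softmax(logits, T)
    return -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()

def fit_temperature(val_logits, val_labels):
    """Fit the single scalar T on validation logits by minimizing NLL.

    Rescaling every logit by the same T preserves the argmax, so the
    model's predictions (and accuracy) are unchanged.
    """
    result = minimize_scalar(nll, bounds=(0.05, 10.0), method="bounded",
                             args=(val_logits, val_labels))
    return result.x
```

For the overconfident networks studied in the paper, the fitted T is typically greater than 1, which softens the softmax distribution and pulls inflated confidences back toward the true accuracy.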