Using Bayesian Model Averaging to Calibrate Forecast Ensembles

May 2005 | Adrian E. Raftery, Tilmann Gneiting, Fadoua Balabdaoui, and Michael Polakowski
This paper proposes a statistical method for postprocessing ensembles used in probabilistic weather forecasting based on Bayesian model averaging (BMA). BMA combines predictive distributions from different sources by weighting them according to their posterior probabilities of generating the forecasts. The BMA predictive probability density function (PDF) is a weighted average of PDFs centered on individual bias-corrected forecasts, with weights reflecting the models' relative contributions to predictive skill. BMA weights can assess the usefulness of ensemble members and help select them, which is useful given the cost of running large ensembles. The BMA PDF can be represented as an unweighted ensemble of any desired size by simulating from the BMA predictive distribution. The BMA predictive variance can be decomposed into two components: between-forecast variability and within-forecast variability. Predictive PDFs or intervals based solely on the ensemble spread incorporate the first component but not the second. Thus, BMA provides a theoretical explanation of the tendency of ensembles to exhibit a spread-error correlation but yet be underdispersive. The method was applied to 48-h forecasts of surface temperature in the Pacific Northwest in January–June 2000 using the University of Washington fifth-generation Pennsylvania State University–NCAR Mesoscale Model (MM5) ensemble. The predictive PDFs were much better calibrated than the raw ensemble, and the BMA forecasts were sharp in that 90% BMA prediction intervals were 66% shorter on average than those produced by sample climatology. As a by-product, BMA yields a deterministic point forecast, and this had root-mean-square errors 7% lower than the best of the ensemble members and 8% lower than the ensemble mean. Similar results were obtained for forecasts of sea level pressure. Simulation experiments show that BMA performs reasonably well when the underlying ensemble is calibrated, or even overdispersed. 
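As a sketch of the structure described above, the BMA predictive PDF, the point forecast, and the variance decomposition can be written as follows. The notation is introduced here for illustration and is not taken from this summary: $K$ ensemble members, bias-corrected forecasts $\tilde f_k$, weights $w_k$, and kernels $g_k$. The second and third lines further assume that every kernel is centered at $\tilde f_k$ with a common variance $\sigma^2$, which is one convenient choice and not the only possible one.

```latex
% Assumed notation: \tilde f_k = bias-corrected forecast of member k,
% w_k = BMA weight of member k, \sigma^2 = common kernel variance (assumption).
\[
  p(y \mid f_1,\dots,f_K) \;=\; \sum_{k=1}^{K} w_k \, g_k\!\left(y \mid \tilde f_k\right),
  \qquad w_k \ge 0, \quad \sum_{k=1}^{K} w_k = 1 .
\]
% Deterministic point forecast: the mean of the mixture.
\[
  \mathrm{E}\!\left[y \mid f_1,\dots,f_K\right] \;=\; \sum_{k=1}^{K} w_k \, \tilde f_k .
\]
% Variance = between-forecast spread + within-forecast variance.
\[
  \mathrm{Var}\!\left(y \mid f_1,\dots,f_K\right)
  \;=\;
  \underbrace{\sum_{k=1}^{K} w_k \Bigl(\tilde f_k - \sum_{i=1}^{K} w_i \tilde f_i\Bigr)^{2}}_{\text{between-forecast}}
  \;+\;
  \underbrace{\sigma^{2}}_{\text{within-forecast}} .
\]
```

Read this way, an interval built only from the ensemble spread captures the first term and omits $\sigma^2$, which is consistent with the abstract's remark that ensembles can show a spread-error correlation and still be underdispersive.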
The paper applies BMA to short-range mesoscale forecasting, using the University of Washington mesoscale short-range ensemble system for the Pacific Northwest. It describes the BMA approach and the estimation of its parameters by maximum likelihood via the EM algorithm, as well as by minimum CRPS (continuous ranked probability score) estimation. It then discusses the decomposition of the BMA predictive variance and the spread-error correlation, and gives an example of a BMA predictive PDF. The results show that BMA yields calibrated and sharp predictive PDFs, with BMA intervals much narrower than those from sample climatology. The BMA weights can be used to select ensemble members, and the results suggest that the least useful member can be removed without significant loss of performance. The paper also reports results for sea level pressure and experiments with simulated ensembles, which show that BMA performs reasonably well when the underlying ensemble is calibrated or even overdispersed.
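The estimation and simulation steps mentioned above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes normal kernels with a single common standard deviation, forecasts that have already been bias-corrected, and plain EM for the maximum likelihood fit (no minimum CRPS refinement). The function names `fit_bma_em`, `bma_mean_and_variance`, and `sample_bma` are hypothetical.

```python
import math
import random


def normal_pdf(y, mean, sigma):
    """Density of a normal distribution N(mean, sigma^2) evaluated at y."""
    z = (y - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_bma_em(forecasts, obs, n_iter=500, tol=1e-9):
    """Estimate BMA weights and a common kernel sigma by EM (hypothetical helper).

    forecasts: one row per training case, each holding K member forecasts
               that are assumed to be bias-corrected already.
    obs:       verifying observations, one per training case.
    """
    n, k = len(obs), len(forecasts[0])
    weights = [1.0 / k] * k
    # Initialise sigma with the pooled root-mean-square error of all members.
    sq_err = sum((y - f) ** 2 for row, y in zip(forecasts, obs) for f in row)
    sigma = math.sqrt(sq_err / (n * k))
    prev_loglik = -math.inf
    for _ in range(n_iter):
        # E step: z[t][j] is the probability that member j is the best
        # forecast for case t, given the current weights and sigma.
        z, loglik = [], 0.0
        for row, y in zip(forecasts, obs):
            dens = [w * normal_pdf(y, f, sigma) for w, f in zip(weights, row)]
            total = sum(dens)
            loglik += math.log(total)
            z.append([d / total for d in dens])
        # M step: weights are the average responsibilities; sigma^2 is the
        # responsibility-weighted mean squared error.
        weights = [sum(zt[j] for zt in z) / n for j in range(k)]
        sq_err = sum(zt[j] * (y - row[j]) ** 2
                     for zt, row, y in zip(z, forecasts, obs) for j in range(k))
        sigma = math.sqrt(sq_err / n)
        # Stop once the log-likelihood has stopped improving.
        if loglik - prev_loglik < tol:
            break
        prev_loglik = loglik
    return weights, sigma


def bma_mean_and_variance(weights, sigma, members):
    """Return (point forecast, between-forecast variance, within-forecast variance)."""
    mean = sum(w * f for w, f in zip(weights, members))
    between = sum(w * (f - mean) ** 2 for w, f in zip(weights, members))
    return mean, between, sigma ** 2


def sample_bma(weights, sigma, members, size, rng):
    """Draw an unweighted ensemble of the given size from the BMA mixture."""
    # Pick a member with probability equal to its weight, then add kernel noise.
    picks = rng.choices(members, weights=weights, k=size)
    return [rng.gauss(f, sigma) for f in picks]
```

In use, `fit_bma_em` would be run on a training window of past forecasts and observations, and the fitted weights and sigma would then be passed to `bma_mean_and_variance` or `sample_bma` together with the current bias-corrected member forecasts. A member whose fitted weight is close to zero is a natural candidate for removal, in the spirit of the member selection described above.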