Improving Marginal Likelihood Estimation for Bayesian Phylogenetic Model Selection

2011 | WANGANG XIE, PAUL O. LEWIS, YU FAN, LYNN KUO AND MING-HUI CHEN
This paper introduces stepping-stone sampling (SS), a new method for estimating the marginal likelihood in Bayesian phylogenetic model selection. The marginal likelihood is the key quantity for comparing models with Bayes factors. The harmonic mean (HM) method is simple but often overestimates the marginal likelihood. Thermodynamic integration (TI) is more accurate but computationally intensive. SS balances accuracy and computational cost by using importance sampling to estimate a series of ratios of normalizing constants along a path between the prior and posterior distributions; the product of these ratios is the marginal likelihood.
SS is compared with HM and TI in simulations and in analyses of real data. SS and TI both estimate the marginal likelihood more accurately than HM, often by a wide margin. SS is slightly less computationally intensive than TI and avoids the discretization bias of TI. Because SS estimates the marginal likelihood itself, its results can be compared with published marginal likelihoods of other models; methods that estimate Bayes factors directly allow such comparisons only when a common reference model is used.
The paper also discusses the importance of priors in Bayesian model selection. An informative prior can prevent a model from fitting the data well by restricting the parameter space. The marginal likelihood accounts for the prior, whereas traditional methods such as AIC, BIC, and the LRT do not. As a result, SS and TI allow the penalty for added parameters to be tuned, whereas traditional methods treat all parameters equally. The paper concludes that SS and TI should be preferred over HM for model selection in phylogenetics because of their higher accuracy, despite the additional computational cost.
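The relationship between the two path-based estimators can be written compactly. The notation below is assumed for this sketch and is not taken from the summary: L(θ) is the likelihood, π(θ) a proper prior, and 0 = β_0 < β_1 < ... < β_K = 1 a grid of powers.

```latex
% Power posterior and its normalizing constant (notation assumed for this sketch)
\[
p_\beta(\theta) = \frac{L(\theta)^{\beta}\,\pi(\theta)}{c_\beta},
\qquad
c_\beta = \int L(\theta)^{\beta}\,\pi(\theta)\,d\theta,
\qquad
c_0 = 1, \quad c_1 = \text{marginal likelihood}.
\]

% Stepping-stone identity: a telescoping product of ratios,
% each estimated by importance sampling from the previous power posterior
\[
c_1 = \prod_{k=1}^{K} \frac{c_{\beta_k}}{c_{\beta_{k-1}}},
\qquad
\frac{c_{\beta_k}}{c_{\beta_{k-1}}}
= \mathrm{E}_{p_{\beta_{k-1}}}\!\left[ L(\theta)^{\beta_k - \beta_{k-1}} \right]
\approx \frac{1}{n} \sum_{i=1}^{n} L(\theta_{k-1,i})^{\beta_k - \beta_{k-1}},
\quad \theta_{k-1,i} \sim p_{\beta_{k-1}}.
\]

% Thermodynamic integration: the same quantity as an integral over beta
\[
\log c_1 = \int_0^1 \mathrm{E}_{p_\beta}\!\left[ \log L(\theta) \right] d\beta .
\]
```

TI has to approximate the integral over β with a numerical quadrature rule on a finite grid, which is where its discretization bias comes from. The stepping-stone product is exact for any grid, so its error comes only from the Monte Carlo estimate of each ratio.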
The SS method is an importance-sampling approach that uses the power posterior as the importance density to estimate the ratio of normalizing constants, making it a viable alternative to TI. The SS method is particularly effective when the number of β intervals is sufficiently large, leading to more accurate estimates of the marginal likelihood.
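A minimal sketch of the idea, under assumptions that are not from the paper: instead of a phylogenetic model it uses a toy normal model with known variance and a normal prior on the mean, chosen because every power posterior is then itself normal (so it can be sampled exactly, with no MCMC) and the marginal likelihood has a closed form to check against. All function names are hypothetical, and the β schedule `(k / K) ** (1 / alpha)` with `alpha=0.3` is an illustrative choice that places most β values near the prior.

```python
import math
import random


def log_mean_exp(values):
    """Numerically stable log of the arithmetic mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def log_likelihood(mu, data, sigma):
    """Log-likelihood of i.i.d. Normal(mu, sigma^2) data."""
    n = len(data)
    sq = sum((y - mu) ** 2 for y in data)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - sq / (2 * sigma ** 2)


def sample_power_posterior(beta, data, sigma, mu0, tau0, size, rng):
    """Exact draws from p_beta(mu), proportional to L(mu)^beta * Normal(mu0, tau0^2).

    Toy-model shortcut: conjugacy makes the power posterior normal.
    A real analysis would run MCMC at each beta instead.
    """
    precision = 1 / tau0 ** 2 + beta * len(data) / sigma ** 2
    mean = (mu0 / tau0 ** 2 + beta * sum(data) / sigma ** 2) / precision
    sd = math.sqrt(1 / precision)
    return [rng.gauss(mean, sd) for _ in range(size)]


def exact_log_marginal(data, sigma, mu0, tau0):
    """Closed-form log marginal likelihood of the toy model."""
    n = len(data)
    d = [y - mu0 for y in data]
    s2, t2 = sigma ** 2, tau0 ** 2
    quad = (sum(x * x for x in d) - t2 * sum(d) ** 2 / (s2 + n * t2)) / s2
    logdet = (n - 1) * math.log(s2) + math.log(s2 + n * t2)
    return -0.5 * (n * math.log(2 * math.pi) + logdet + quad)


def stepping_stone(data, sigma, mu0, tau0, K, size, rng, alpha=0.3):
    """Stepping-stone estimate of the log marginal likelihood.

    Each ratio c_{beta_k} / c_{beta_{k-1}} is estimated by averaging
    L(mu)^(beta_k - beta_{k-1}) over draws from the power posterior
    at beta_{k-1}; the log ratios are then summed.
    """
    betas = [(k / K) ** (1 / alpha) for k in range(K + 1)]  # illustrative schedule
    total = 0.0
    for k in range(1, K + 1):
        draws = sample_power_posterior(betas[k - 1], data, sigma, mu0, tau0, size, rng)
        delta = betas[k] - betas[k - 1]
        total += log_mean_exp([delta * log_likelihood(mu, data, sigma) for mu in draws])
    return total


def harmonic_mean(data, sigma, mu0, tau0, size, rng):
    """Harmonic mean estimate of the log marginal likelihood from posterior draws."""
    draws = sample_power_posterior(1.0, data, sigma, mu0, tau0, size, rng)
    return -log_mean_exp([-log_likelihood(mu, data, sigma) for mu in draws])


if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.gauss(1.0, 1.0) for _ in range(20)]  # simulated toy data
    print("exact          :", exact_log_marginal(data, 1.0, 0.0, 10.0))
    print("stepping-stone :", stepping_stone(data, 1.0, 0.0, 10.0, 16, 2000, rng))
    print("harmonic mean  :", harmonic_mean(data, 1.0, 0.0, 10.0, 2000, rng))
```

With a vague prior such as the one used here, the harmonic mean estimate would be expected to land above the exact value, because posterior draws rarely visit the low-likelihood regions that dominate the expectation it targets. The stepping-stone estimate should track the exact value much more closely, and increasing `K` shrinks the gap between adjacent power posteriors, which keeps each importance-sampling ratio well behaved.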