Evaluation methods for topic models

This paper evaluates methods for estimating the probability of held-out documents under a trained topic model, focusing on Latent Dirichlet Allocation (LDA). The authors demonstrate that commonly used methods, such as the harmonic mean method and the empirical likelihood method, are inaccurate and have high variance. They propose two alternatives, a Chib-style estimator and a "left-to-right" evaluation algorithm, both of which are accurate and efficient.
LDA is a generative model for text in which each document is a mixture of topics and each topic is a distribution over words. Dirichlet priors are placed on the topic-word distributions and on the document-specific topic distributions. The probability of a held-out document is estimated by marginalizing over the topic assignments, using Gibbs sampling or other MCMC methods to approximate their posterior distribution.
The paper evaluates several estimators of this probability: importance sampling, the harmonic mean method, annealed importance sampling (AIS), and the Chib-style estimator. The authors show that the harmonic mean method is unstable and often overestimates the probability of held-out documents. AIS and the Chib-style estimator are more accurate; AIS is generally the more accurate of the two but is computationally intensive. The "left-to-right" algorithm is also effective, particularly for document completion tasks, in which the probability of the second half of a document is estimated given the first half.
The authors compare these methods on both synthetic and real-world data sets. The Chib-style estimator and the "left-to-right" algorithm both perform well, with the latter consistently outperforming the former on real-world data. On document completion tasks, the "left-to-right" algorithm provides accurate estimates at lower computational cost than AIS.
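
The "left-to-right" idea described above can be sketched on a toy model small enough to check by brute force. This is a minimal illustration, not the authors' code: the function names, the toy topic-word matrix PHI, the Dirichlet parameter vector ALPHA_M, and the document DOC are all assumptions introduced here. The sketch assumes the topics are fixed and the per-document topic proportions are integrated out, so each position's topic depends on the earlier assignments through their counts.

```python
import itertools
import math
import random


def joint_prob(words, topics, phi, alpha_m):
    """P(w, z | Phi, alpha*m), with the document's topic proportions integrated out."""
    alpha = sum(alpha_m)
    counts = [0] * len(alpha_m)
    prob = 1.0
    for n, (w, t) in enumerate(zip(words, topics)):
        # Predictive probability of topic t given earlier assignments, times P(word | topic).
        prob *= (counts[t] + alpha_m[t]) / (n + alpha) * phi[t][w]
        counts[t] += 1
    return prob


def exact_log_prob(words, phi, alpha_m):
    """Brute-force log P(w): sums over all T**N topic assignments (tiny documents only)."""
    T = len(alpha_m)
    total = sum(joint_prob(words, z, phi, alpha_m)
                for z in itertools.product(range(T), repeat=len(words)))
    return math.log(total)


def left_to_right_log_prob(words, phi, alpha_m, num_particles, rng):
    """Left-to-right style estimate of log P(w) as a sum of log P(w_n | w_<n) estimates."""
    T = len(alpha_m)
    alpha = sum(alpha_m)
    z = [[] for _ in range(num_particles)]              # topic assignments per particle
    counts = [[0] * T for _ in range(num_particles)]    # topic counts per particle
    log_prob = 0.0
    for n, w in enumerate(words):
        p_n = 0.0
        for r in range(num_particles):
            # Resample each earlier assignment given the other earlier assignments.
            for i in range(n):
                counts[r][z[r][i]] -= 1
                wts = [phi[t][words[i]] * (counts[r][t] + alpha_m[t]) for t in range(T)]
                z[r][i] = rng.choices(range(T), weights=wts)[0]
                counts[r][z[r][i]] += 1
            # P(w_n, z_n = t | z_<n) for every topic t; the sum is P(w_n | z_<n).
            wts = [phi[t][w] * (counts[r][t] + alpha_m[t]) / (n + alpha) for t in range(T)]
            p_n += sum(wts)
            # Extend the particle with a topic for the current position.
            t_new = rng.choices(range(T), weights=wts)[0]
            z[r].append(t_new)
            counts[r][t_new] += 1
        log_prob += math.log(p_n / num_particles)
    return log_prob


# Hypothetical toy model: 2 topics over a 3-word vocabulary, one 4-word document.
PHI = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
ALPHA_M = [1.0, 1.0]
DOC = [0, 2, 2, 1]

if __name__ == "__main__":
    rng = random.Random(0)
    print("exact:        ", exact_log_prob(DOC, PHI, ALPHA_M))
    print("left-to-right:", left_to_right_log_prob(DOC, PHI, ALPHA_M, 2000, rng))
```

The brute-force sum is only feasible because the document has four words and two topics. On realistic documents the enumeration is out of reach, which is why sampling-based estimators such as this one are needed at all.
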
The paper concludes that the Chib-style estimator and the "left-to-right" algorithm provide a clear and accurate methodology for evaluating topic models, offering a more reliable alternative to the commonly used methods. The results highlight the importance of accurate evaluation methods in topic modeling, as they can significantly impact the selection and comparison of different models.
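
For contrast, the harmonic mean method that the summary describes as unstable can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the paper's implementation: the helper names and the toy values PHI, ALPHA_M, and DOC are hypothetical. Topic assignments are drawn by Gibbs sampling with the topics held fixed, and the estimate is the harmonic mean of P(w | z) over those draws, computed in log space.

```python
import math
import random


def gibbs_samples(words, phi, alpha_m, num_samples, burn_in, rng):
    """Draw topic assignments approximately from P(z | w, Phi, alpha*m)."""
    T = len(alpha_m)
    z = [rng.randrange(T) for _ in words]
    counts = [z.count(t) for t in range(T)]
    samples = []
    for sweep in range(burn_in + num_samples):
        for n, w in enumerate(words):
            # Remove position n, then resample its topic given all other positions.
            counts[z[n]] -= 1
            wts = [phi[t][w] * (counts[t] + alpha_m[t]) for t in range(T)]
            z[n] = rng.choices(range(T), weights=wts)[0]
            counts[z[n]] += 1
        if sweep >= burn_in:
            samples.append(list(z))
    return samples


def log_likelihood(words, z, phi):
    """log P(w | z, Phi): each word is drawn from its assigned topic."""
    return sum(math.log(phi[t][w]) for w, t in zip(words, z))


def harmonic_mean_log_prob(log_liks):
    """log of S / sum_s (1 / P(w | z_s)), computed stably in log space."""
    neg = [-x for x in log_liks]
    m = max(neg)
    log_sum = m + math.log(sum(math.exp(x - m) for x in neg))
    return math.log(len(log_liks)) - log_sum


# Hypothetical toy model: 2 topics over a 3-word vocabulary, one 4-word document.
PHI = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
ALPHA_M = [1.0, 1.0]
DOC = [0, 2, 2, 1]

if __name__ == "__main__":
    rng = random.Random(0)
    draws = gibbs_samples(DOC, PHI, ALPHA_M, num_samples=500, burn_in=50, rng=rng)
    lls = [log_likelihood(DOC, z, PHI) for z in draws]
    print("harmonic mean estimate:", harmonic_mean_log_prob(lls))
```

Because the sum is dominated by the draws with the smallest likelihood, a single rare low-likelihood sample can shift the estimate substantially, which is one way to see why the estimator has high variance.
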

Evaluation Methods for Topic Models

2009 | Hanna M. Wallach, Iain Murray, Ruslan Salakhutdinov, David Mimno