The article introduces the *topicmodels* R package, which provides infrastructure for fitting topic models on the text mining data structures of the *tm* package. *tm* supplies the text mining functionality that *topicmodels* builds on, such as corpus construction and transformation to document-term matrices. The package interfaces to two algorithms for fitting topic models: variational expectation-maximization (VEM) and Gibbs sampling. Latent Dirichlet Allocation (LDA) can be fitted with either algorithm, while the Correlated Topics Model (CTM) is fitted with VEM. The article outlines the specification and estimation of LDA and CTM, including the generative process, maximum likelihood estimation, and variational inference. It also discusses preprocessing steps, model selection, and inference methods. An illustrative example using abstracts from the *Journal of Statistical Software* demonstrates how to fit LDA and CTM models, evaluate their performance, and interpret the results. The package aims to extend the capabilities of text mining in R and facilitate the use of advanced topic modeling techniques.
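The workflow summarized above can be sketched in a few lines of R. This is a minimal illustration, not the article's own example. It assumes the `AssociatedPress` document-term matrix that ships with *topicmodels* as a stand-in for the *Journal of Statistical Software* abstracts. The number of topics, the seed, the Gibbs settings, and the train/held-out split are arbitrary choices made for the sketch.

```r
library(topicmodels)

# Assumption: AssociatedPress (a DocumentTermMatrix bundled with topicmodels)
# stands in for the JSS abstracts used in the article.
data("AssociatedPress", package = "topicmodels")
train    <- AssociatedPress[1:50, ]
held_out <- AssociatedPress[51:60, ]

k <- 3  # number of topics; arbitrary for this sketch

# LDA fitted with variational EM
lda_vem <- LDA(train, k = k, method = "VEM",
               control = list(seed = 2011))

# LDA fitted with Gibbs sampling (short chain, for illustration only)
lda_gibbs <- LDA(train, k = k, method = "Gibbs",
                 control = list(seed = 2011, burnin = 100, iter = 200))

# CTM fitted with variational EM
ctm_vem <- CTM(train, k = k, method = "VEM",
               control = list(seed = 2011))

# Interpretation: the five most likely terms per topic,
# and the most likely topic for each training document
top_terms  <- terms(lda_vem, 5)
doc_topics <- topics(lda_gibbs)

# Evaluation: perplexity of the VEM fit on held-out documents
pp <- perplexity(lda_vem, newdata = held_out)
```

With real data, the number of topics would normally be chosen by comparing fits across several values of `k`, for example by held-out perplexity, rather than fixed in advance as here.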

topicmodels: An R Package for Fitting Topic Models

2011 | Bettina Grün, Kurt Hornik