Supervised Topic Models

3 Mar 2010 | David M. Blei, Jon D. McAuliffe
The paper introduces supervised latent Dirichlet allocation (sLDA), a statistical model of labeled documents that accommodates a variety of response types. sLDA extends unsupervised latent Dirichlet allocation (LDA) by pairing each document with a response variable, so that the fitted model can predict the response for a new document from its content. The authors derive an approximate maximum-likelihood procedure for parameter estimation that uses variational methods to handle intractable posterior expectations. They demonstrate sLDA on two real-world problems: predicting movie ratings from reviews and predicting the political tone of amendments in the U.S. Senate from their text. In these experiments, sLDA predicts better than modern regularized regression and better than unsupervised LDA followed by a separate regression.
The paper also discusses the computational aspects of sLDA, including posterior inference, parameter estimation, and prediction, and provides specific algorithms for Gaussian and Poisson responses. Finally, the authors suggest a general approach for handling other exponential family responses using the delta method.
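To make the Gaussian-response case concrete, the following is a minimal sketch of how a single labeled document could be generated under the usual sLDA formulation: topic proportions are drawn from a Dirichlet, each word is drawn from its assigned topic, and the response is drawn from a normal distribution centered on a linear function of the document's empirical topic frequencies. The summary above does not spell out these steps, so the function names (`sample_dirichlet`, `generate_document`), the parameter names (`alpha`, `beta`, `eta`, `sigma`), and the toy values are assumptions chosen for illustration. This is a simulation of the model only, not the authors' variational inference or estimation procedure.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(alpha, beta, eta, sigma, n_words, rng):
    """Generate one document and its response (illustrative sketch).

    alpha: Dirichlet parameter over K topics (assumed name)
    beta:  K rows, each a probability distribution over the vocabulary
    eta:   K regression coefficients for the response
    sigma: standard deviation of the Gaussian response noise
    """
    num_topics = len(alpha)
    vocab = range(len(beta[0]))

    # Per-document topic proportions.
    theta = sample_dirichlet(alpha, rng)

    # One topic assignment per word, then one word per assignment.
    topics = rng.choices(range(num_topics), weights=theta, k=n_words)
    words = [rng.choices(vocab, weights=beta[z], k=1)[0] for z in topics]

    # Empirical topic frequencies: the covariates of the response.
    z_bar = [topics.count(k) / n_words for k in range(num_topics)]

    # Gaussian response centered on the linear predictor eta . z_bar.
    mean = sum(e * z for e, z in zip(eta, z_bar))
    y = rng.gauss(mean, sigma)
    return words, z_bar, y


# Toy usage: two topics over a three-word vocabulary.
rng = random.Random(0)
toy_beta = [[0.9, 0.1, 0.0], [0.0, 0.1, 0.9]]
words, z_bar, y = generate_document([1.0, 1.0], toy_beta, [-1.0, 1.0], 0.5, 20, rng)
```

Because the response depends on the same topic assignments that generate the words, fitting the model lets the response shape the topics, which is the difference from running unsupervised LDA first and regressing on its output afterwards.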