Scene Classification Via pLSA

2006 | Anna Bosch, Andrew Zisserman, and Xavier Muñoz
The paper "Scene Classification Via pLSA" by Anna Bosch, Andrew Zisserman, and Xavier Muñoz introduces a novel method for scene classification using probabilistic Latent Semantic Analysis (pLSA). The authors aim to discover objects in images without supervision and use this object distribution for scene classification. They apply pLSA, a generative model originally from statistical text analysis, to a bag-of-visual-words representation of images. The classification is performed using a k-nearest neighbor classifier. The paper investigates the impact of changes in the visual vocabulary and the number of latent topics learned, and develops a new vocabulary using color SIFT descriptors. The classification performance is compared to supervised approaches by Vogel and Schiele, Oliva and Torralba, and a semi-supervised approach by Fei Fei and Perona. The results show that the combination of unsupervised pLSA followed by supervised nearest neighbor classification achieves superior performance. The introduction highlights the challenges in scene classification due to variability, ambiguity, and varying conditions. The authors discuss two basic strategies: using low-level features and using intermediate representations. They draw inspiration from previous works on pLSA, sparse features, dense SIFT, and semi-supervised LDA. The paper also compares their method to these previous methods, demonstrating superior performance in all cases. The pLSA model is described in detail, where images are treated as documents and object categories as topics. The co-occurrence table of counts and the latent variable model are explained, providing a statistical framework for clustering multiple object categories per image.The paper "Scene Classification Via pLSA" by Anna Bosch, Andrew Zisserman, and Xavier Muñoz introduces a novel method for scene classification using probabilistic Latent Semantic Analysis (pLSA). 
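The classification step can be sketched in the same spirit: each image is represented by its topic distribution P(z|d), and a test image receives the majority scene label of its k nearest training images. This sketch uses Euclidean distance as an assumption; the paper's exact distance measure and parameter settings may differ:

```python
import numpy as np

def knn_classify(train_topics, train_labels, test_topics, k=5):
    """Assign each test image the majority label of its k nearest
    training images in topic-distribution space (illustrative sketch)."""
    preds = []
    for q in test_topics:
        # Euclidean distance from the query to every training image.
        dists = np.linalg.norm(train_topics - q, axis=1)
        nearest = np.argsort(dists)[:k]
        # Majority vote among the k nearest neighbours.
        labels, votes = np.unique(train_labels[nearest], return_counts=True)
        preds.append(labels[np.argmax(votes)])
    return np.array(preds)
```

The appeal of this two-stage design is that the expensive, label-free part (topic discovery) is unsupervised, while the supervised part is reduced to a simple nearest-neighbor lookup over low-dimensional topic vectors.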