Understanding Spectral clustering and the high-dimensional stochastic blockmodel

The paper "Spectral Clustering and the High-Dimensional Stochastic Blockmodel" by Karl Rohe, Sourav Chatterjee, and Bin Yu explores the effectiveness of spectral clustering in identifying communities or clusters within networks. Spectral clustering is a popular and computationally efficient method for discovering these communities, particularly in the context of the Stochastic Blockmodel, a social network model with well-defined communities. The authors bound the number of nodes that spectral clustering might miscluster, providing the first asymptotic results that allow the number of clusters to grow with the number of nodes, a significant advancement in the field. The paper begins by introducing the concept of spectral clustering and its applications in various fields, emphasizing its popularity and computational feasibility. It then delves into the Stochastic Blockmodel, a specific type of random network model characterized by well-defined communities. The authors show that under the more general latent space model, the eigenvectors of the normalized graph Laplacian converge to the eigenvectors of a "population" normalized graph Laplacian. This convergence has implications for graph visualization techniques. The main technical results of the paper are presented in two sections. The first section studies the latent space model, proving that the eigenvectors of the normalized graph Laplacian converge to the population eigenvectors. The second section applies these results to the high-dimensional Stochastic Blockmodel, showing that spectral clustering can correctly identify the blocks in the model even when the number of blocks grows with the number of nodes. The authors also provide simulation results to illustrate how the asymptotic bounds in the paper can guide finite sample results. These simulations suggest that spectral clustering is consistent in certain scenarios and that the eigengap assumption is crucial for the theoretical results. However, the simulations also highlight some limitations, such as the need for a stronger condition on the minimum expected degree. Overall, the paper contributes to the understanding of spectral clustering in high-dimensional settings and provides a benchmark for evaluating the performance of clustering algorithms in complex networks.The paper "Spectral Clustering and the High-Dimensional Stochastic Blockmodel" by Karl Rohe, Sourav Chatterjee, and Bin Yu explores the effectiveness of spectral clustering in identifying communities or clusters within networks. Spectral clustering is a popular and computationally efficient method for discovering these communities, particularly in the context of the Stochastic Blockmodel, a social network model with well-defined communities. The authors bound the number of nodes that spectral clustering might miscluster, providing the first asymptotic results that allow the number of clusters to grow with the number of nodes, a significant advancement in the field. The paper begins by introducing the concept of spectral clustering and its applications in various fields, emphasizing its popularity and computational feasibility. It then delves into the Stochastic Blockmodel, a specific type of random network model characterized by well-defined communities. The authors show that under the more general latent space model, the eigenvectors of the normalized graph Laplacian converge to the eigenvectors of a "population" normalized graph Laplacian. This convergence has implications for graph visualization techniques. The main technical results of the paper are presented in two sections. The first section studies the latent space model, proving that the eigenvectors of the normalized graph Laplacian converge to the population eigenvectors. The second section applies these results to the high-dimensional Stochastic Blockmodel, showing that spectral clustering can correctly identify the blocks in the model even when the number of blocks grows with the number of nodes. The authors also provide simulation results to illustrate how the asymptotic bounds in the paper can guide finite sample results. These simulations suggest that spectral clustering is consistent in certain scenarios and that the eigengap assumption is crucial for the theoretical results. However, the simulations also highlight some limitations, such as the need for a stronger condition on the minimum expected degree. Overall, the paper contributes to the understanding of spectral clustering in high-dimensional settings and provides a benchmark for evaluating the performance of clustering algorithms in complex networks.

SPECTRAL CLUSTERING AND THE HIGH-DIMENSIONAL STOCHASTIC BLOCKMODEL

2011, Vol. 39, No. 4, 1878–1915 | BY KARL ROHE, SOURAV CHATTERJEE AND BIN YU