April 26–30, 2010, Raleigh, North Carolina, USA | Jure Leskovec, Kevin J. Lang, Michael W. Mahoney
This paper explores a range of network community detection methods to compare their performance and understand the systematic biases in the clusters they identify. The authors evaluate several common objective functions and approximation algorithms, focusing on a size-resolved version of the optimization problem to examine the structural properties of clusters at different sizes. They compare two graph partitioning algorithms—Local Spectral and Metis+MQI—and several heuristic approaches, such as Graclus and Newman’s Dendrogram. The study reveals that while these algorithms generally optimize the community score function over a range of size scales, there are systematic biases and trade-offs in their performance. For instance, Local Spectral tends to produce more compact clusters but with higher external connectivity, while Metis+MQI finds lower-conductance cuts but with better internal connectivity. The paper also discusses the behavior of various objective functions, showing that while some metrics like conductance and normalized cut exhibit similar behaviors, others like modularity follow different patterns. Finally, the authors compute theoretical lower bounds to assess the quality of the clusters identified by the algorithms, demonstrating that the NCP plot shape is an intrinsic property of large networks rather than an artifact of the algorithms or objective functions.This paper explores a range of network community detection methods to compare their performance and understand the systematic biases in the clusters they identify. The authors evaluate several common objective functions and approximation algorithms, focusing on a size-resolved version of the optimization problem to examine the structural properties of clusters at different sizes. They compare two graph partitioning algorithms—Local Spectral and Metis+MQI—and several heuristic approaches, such as Graclus and Newman’s Dendrogram. The study reveals that while these algorithms generally optimize the community score function over a range of size scales, there are systematic biases and trade-offs in their performance. For instance, Local Spectral tends to produce more compact clusters but with higher external connectivity, while Metis+MQI finds lower-conductance cuts but with better internal connectivity. The paper also discusses the behavior of various objective functions, showing that while some metrics like conductance and normalized cut exhibit similar behaviors, others like modularity follow different patterns. Finally, the authors compute theoretical lower bounds to assess the quality of the clusters identified by the algorithms, demonstrating that the NCP plot shape is an intrinsic property of large networks rather than an artifact of the algorithms or objective functions.