[slides and audio] Performance of modularity maximization in practical contexts.

Modularity maximization is a widely used technique for identifying modular structures in networks, but its behavior and accuracy in practical contexts are not well understood. This study characterizes the performance of modularity maximization in real-world networks. It clarifies the resolution limit phenomenon, showing that modularity maximization can fail to resolve small modules due to the random graph model assumption. The modularity function Q exhibits extreme degeneracies, with an exponential number of high-scoring solutions and no clear global maximum. The maximum modularity Q_max depends strongly on network size and number of modules. Real-world examples show that degenerate solutions can disagree on many partition properties, such as module composition and size distribution. These results imply that modularity maximization outputs should be interpreted cautiously. They also explain why heuristics often find high-modularity partitions and why different heuristics can disagree on modular structure. Mitigating these behaviors could involve combining information from degenerate solutions or using generative models. The resolution limit is a phenomenon where modularity maximization fails to resolve modules smaller than a certain size. This occurs because the random graph model assumption leads to an implicit preference for inter-module connections. The resolution limit is better understood as a consequence of this assumption, not an implicit preference on edge weights within modules. The resolution limit appears in ring networks because inter-module connections have constant weight, while the null model expects this weight to decrease with network size. Weighted networks with inter-module connectivity matching the null expectation can avoid the resolution limit. The resolution limit is a real problem for interpreting the optimal partition's composition. However, clever algorithms can circumvent it by recursively partitioning modules or using history of merges. Multi-scale modularity-based methods allow specifying a target resolution limit. The resolution limit is better explained as a systematic deviation between inter-module connectivity and the connectivity expected under the random-graph null model. Modularity maximization exhibits extreme degeneracies, with many high-modularity partitions that are structurally similar but not identical. This is due to the exponential number of ways to combine modular structures. As networks become more modular, the globally optimal partition becomes harder to find among suboptimal alternatives. The number of degenerate solutions grows combinatorially with the number of modular structures. The modularity function is strongly peaked in a relative sense, with a vanishingly small fraction of partitions being degenerate. Hierarchical networks exhibit at least as many degenerate solutions as simple modular networks. The modularity scores of alternative solutions can be even closer. The existence of extreme degeneracies in the modularity function does not depend on the network's detailed structure but on the presence of many groups with few intergroup connections. These groups are the "building blocks" used to construct high-modularity partitions. The modularity function is glassy, not strongly peaked around the optimal partition, which is precisely the case where modularity maximization performs best. Similar degeneracies are likely to occur in other module-identificationModularity maximization is a widely used technique for identifying modular structures in networks, but its behavior and accuracy in practical contexts are not well understood. This study characterizes the performance of modularity maximization in real-world networks. It clarifies the resolution limit phenomenon, showing that modularity maximization can fail to resolve small modules due to the random graph model assumption. The modularity function Q exhibits extreme degeneracies, with an exponential number of high-scoring solutions and no clear global maximum. The maximum modularity Q_max depends strongly on network size and number of modules. Real-world examples show that degenerate solutions can disagree on many partition properties, such as module composition and size distribution. These results imply that modularity maximization outputs should be interpreted cautiously. They also explain why heuristics often find high-modularity partitions and why different heuristics can disagree on modular structure. Mitigating these behaviors could involve combining information from degenerate solutions or using generative models. The resolution limit is a phenomenon where modularity maximization fails to resolve modules smaller than a certain size. This occurs because the random graph model assumption leads to an implicit preference for inter-module connections. The resolution limit is better understood as a consequence of this assumption, not an implicit preference on edge weights within modules. The resolution limit appears in ring networks because inter-module connections have constant weight, while the null model expects this weight to decrease with network size. Weighted networks with inter-module connectivity matching the null expectation can avoid the resolution limit. The resolution limit is a real problem for interpreting the optimal partition's composition. However, clever algorithms can circumvent it by recursively partitioning modules or using history of merges. Multi-scale modularity-based methods allow specifying a target resolution limit. The resolution limit is better explained as a systematic deviation between inter-module connectivity and the connectivity expected under the random-graph null model. Modularity maximization exhibits extreme degeneracies, with many high-modularity partitions that are structurally similar but not identical. This is due to the exponential number of ways to combine modular structures. As networks become more modular, the globally optimal partition becomes harder to find among suboptimal alternatives. The number of degenerate solutions grows combinatorially with the number of modular structures. The modularity function is strongly peaked in a relative sense, with a vanishingly small fraction of partitions being degenerate. Hierarchical networks exhibit at least as many degenerate solutions as simple modular networks. The modularity scores of alternative solutions can be even closer. The existence of extreme degeneracies in the modularity function does not depend on the network's detailed structure but on the presence of many groups with few intergroup connections. These groups are the "building blocks" used to construct high-modularity partitions. The modularity function is glassy, not strongly peaked around the optimal partition, which is precisely the case where modularity maximization performs best. Similar degeneracies are likely to occur in other module-identification

The performance of modularity maximization in practical contexts

1 Apr 2010 | Benjamin H. Good,1,2,* Yves-Alexandre de Montjoye,3,2,† and Aaron Clauset2,‡