A Comprehensive Survey of Clustering Algorithms

Received: 25 May 2015 / Revised: 18 July 2015 / Accepted: 31 July 2015 / Published online: 12 August 2015 | Dongkuan Xu, Yingjie Tian
This paper provides a comprehensive survey of clustering algorithms. It begins with the basic definitions and the standard clustering process, along with commonly used distance and similarity functions and evaluation indicators. It then analyzes traditional and modern clustering algorithms, grouped into 9 and 10 categories, respectively. Traditional algorithms include partition-based, hierarchy-based, fuzzy theory-based, distribution-based, density-based, graph theory-based, grid-based, fractal theory-based, and model-based methods. Modern algorithms cover kernel-based, ensemble, swarm intelligence, quantum theory, spectral graph theory, affinity propagation, density and distance-based, spatial data, data stream, and large-scale data clustering. Each category is discussed in terms of its core ideas, advantages, disadvantages, and time complexity. The paper aims to provide a systematic and clear overview of clustering methods, highlighting their practical value and significance in data analysis.