Unsupervised K-Means Clustering Algorithm

April 20, 2020 | KRISTINA P. SINAGA AND MIIN-SHEN YANG
This paper proposes an unsupervised k-means (U-k-means) clustering algorithm that determines the number of clusters automatically, without requiring initialization or parameter selection. U-k-means adds entropy-based penalty terms to the k-means objective function so that the number of clusters adjusts to the structure of the data. Because it needs neither initial values nor tuned parameters, the authors present it as more robust than traditional k-means and related clustering methods.
The algorithm is tested on synthetic and real-world data sets and compared with k-means, X-means, robust EM, C-FS, and RL-FCM. The reported experiments show high clustering accuracy, particularly when the number of clusters is unknown, and better accuracy and robustness than the compared methods on noisy data and complex data structures. The computational complexity is O(ncd), where n is the number of data points, c is the number of clusters, and d is the dimensionality of the data. In the reported timing comparison, U-k-means was the fastest of the tested algorithms.
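To make the idea of an entropy-style penalty concrete, here is a minimal sketch in Python with NumPy. It is not the authors' U-k-means procedure: the function name `penalized_kmeans`, the fixed weight `gamma`, the pruning threshold `min_count`, and the explicit `init_centers` argument are all assumptions of this illustration, whereas the paper describes an algorithm that needs no initialization or parameter selection. The sketch only shows how subtracting gamma * ln(alpha_k) from the squared distance lets sparsely populated clusters lose their points and be discarded.

```python
import numpy as np


def penalized_kmeans(X, init_centers, gamma=10.0, min_count=1, max_iter=100):
    """Hard-assignment k-means with a log-proportion penalty (illustrative sketch).

    gamma, min_count and init_centers are hypothetical knobs of this sketch,
    not parameters of the published U-k-means algorithm.
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float)
    alpha = np.full(len(centers), 1.0 / len(centers))  # cluster proportions

    def assign(centers, alpha):
        # An n x c table of squared distances, each entry costing O(d): O(ncd) work.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Subtracting gamma * ln(alpha_k) makes small clusters more expensive to join.
        return (d2 - gamma * np.log(alpha)[None, :]).argmin(axis=1)

    for _ in range(max_iter):
        labels = assign(centers, alpha)
        counts = np.bincount(labels, minlength=len(centers))
        keep = counts > min_count
        keep[counts.argmax()] = True  # never drop every cluster
        if not keep.all():
            # Discard starved clusters, renormalise the proportions, then reassign.
            centers = centers[keep]
            alpha = counts[keep] / counts[keep].sum()
            continue
        new_centers = np.array(
            [X[labels == k].mean(axis=0) for k in range(len(centers))]
        )
        new_alpha = counts / len(X)
        converged = np.allclose(new_centers, centers) and np.allclose(new_alpha, alpha)
        centers, alpha = new_centers, new_alpha
        if converged:
            break

    return centers, alpha, assign(centers, alpha)


# Toy data: two groups of five points on a line, started from four centers.
X = np.array([0, 1, 2, 3, 4, 20, 21, 22, 23, 24], dtype=float).reshape(-1, 1)
init = X[[0, 3, 5, 8]]

centers, alpha, labels = penalized_kmeans(X, init, gamma=10.0)
print("penalized:", len(centers), "clusters at", centers.ravel())

plain_centers, _, _ = penalized_kmeans(X, init, gamma=0.0)
print("gamma = 0:", len(plain_centers), "clusters at", plain_centers.ravel())
```

On this toy data the penalized run prunes the four starting centers down to two, while setting `gamma=0` reduces the update to plain k-means and keeps all four. How many clusters survive in general depends on the hand-picked `gamma` and `min_count`, which is exactly the kind of tuning the paper's method is said to avoid.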