Unsupervised Learning

2018 | J. Webster (ed.)
This article reviews traditional and current methods of classification in unsupervised learning, focusing on cluster analysis and self-organizing neural networks. Both approaches aim to minimize the distance between input vectors and their representations without assuming a predefined cluster structure. Cluster analysis spans hard clustering (hierarchical and nonhierarchical), fuzzy clustering, and mixture clustering; self-organizing maps are artificial neural networks that learn to represent input data patterns through unsupervised learning.

Hard clustering assigns each unit to exactly one cluster based on similarity, whereas fuzzy clustering allows a unit to belong to several clusters with varying degrees of membership. Hierarchical clustering produces a nested sequence of partitions, built either agglomeratively or divisively and visualized with dendrograms, while nonhierarchical methods such as c-means and c-medoids partition the data directly into a fixed number of clusters. Dissimilarities between units are computed with distance measures such as the Euclidean, Manhattan, and Mahalanobis distances.

Among fuzzy methods, Fuzzy c-Means (FcM) minimizes a sum-of-squared-error objective, with a fuzziness parameter controlling how sharp the partition is. Cluster validity criteria, such as the Calinski-Harabasz and silhouette criteria, are used to choose the number of clusters. Fuzzy c-Medoids (FcMd) is a variant of FcM that represents each cluster by a medoid (an observed unit) rather than a centroid. Other fuzzy approaches include possibilistic clustering, which relaxes the membership constraints, and robust fuzzy clustering, which handles noise and outliers.
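The distance measures named above are easy to state concretely. Below is a minimal NumPy sketch (the article's software pointers are for R; Python is used here purely for illustration, and the data are invented toy values):

```python
import numpy as np

# Two toy units described by two variables; values are illustrative only.
x = np.array([1.0, 2.0])
y = np.array([4.0, 6.0])

# Euclidean distance: straight-line distance between the two units.
euclidean = np.sqrt(((x - y) ** 2).sum())   # -> 5.0

# Manhattan (city-block) distance: sum of absolute coordinate differences.
manhattan = np.abs(x - y).sum()             # -> 7.0

# Mahalanobis distance: Euclidean distance after rescaling by the inverse
# covariance of the data, so correlated or high-variance variables count less.
data = np.array([[1.0, 2.0], [4.0, 6.0], [2.0, 3.0], [5.0, 8.0]])
S_inv = np.linalg.inv(np.cov(data, rowvar=False))
diff = x - y
mahalanobis = np.sqrt(diff @ S_inv @ diff)
```

The Mahalanobis distance depends on the covariance of the whole data set, which is why it is computed from `data` rather than from the two units alone.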
Techniques such as fuzzy clustering with a noise cluster, fuzzy clustering with exponential distance, and trimmed fuzzy clustering are designed to improve robustness in the presence of noise. Methods for nonstandard data, such as fuzzy, symbolic, and interval data, are also reviewed, along with clustering approaches for complex data structures such as text, time, and spatial data. The article further discusses clustering strategies including co-clustering, comparison clustering, and consensus clustering, and highlights software implementations in R for these methods.
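Several of the fuzzy methods surveyed here are elaborations of plain FcM, so its alternating updates are worth seeing once. The following is a minimal NumPy sketch, not the article's implementation: memberships and centroids are updated in turn, with the fuzziness parameter m > 1 controlling how soft the partition is, and the toy data are invented for illustration.

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, n_iter=100, seed=0):
    """Minimal fuzzy c-means: alternate membership and centroid updates.

    X : (n, p) data matrix; c : number of clusters; m > 1 : fuzziness.
    Returns (U, V): membership matrix (n, c) and centroids (c, p).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships; each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centroid update: membership-weighted means of the data.
        V = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared distances from every unit to every centroid.
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + 1e-12
        # Membership update: inversely proportional to d2^(1/(m-1)).
        inv = d2 ** (-1.0 / (m - 1))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, V

# Two well-separated 1-D groups: memberships should end up near 0 or 1.
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
U, V = fuzzy_c_means(X, c=2)
```

Setting m close to 1 drives the memberships toward 0/1 and recovers hard c-means behavior; larger m gives a fuzzier partition.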
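The silhouette criterion mentioned among the validity measures also has a compact form: for each unit, compare its mean distance to its own cluster (a) with its mean distance to the nearest other cluster (b). A minimal NumPy sketch with invented toy data (the Calinski-Harabasz index plays an analogous role but is not shown):

```python
import numpy as np

def silhouette(X, labels):
    """Mean silhouette width: (b - a) / max(a, b) averaged over units,
    where a = mean distance to own cluster and b = mean distance to the
    nearest other cluster. Values near 1 indicate a good partition."""
    n = len(X)
    # Pairwise distance matrix (1-D data uses absolute differences).
    D = np.abs(X[:, None] - X[None, :]) if X.ndim == 1 else \
        np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    s = []
    for i in range(n):
        own = (labels == labels[i])
        other_ds = [D[i][labels == k].mean()
                    for k in set(labels) if k != labels[i]]
        if own.sum() == 1 or not other_ds:
            s.append(0.0)  # singleton clusters get silhouette 0 by convention
            continue
        a = D[i][own].sum() / (own.sum() - 1)  # excludes the zero self-distance
        b = min(other_ds)
        s.append((b - a) / max(a, b))
    return float(np.mean(s))

X = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
good = silhouette(X, np.array([0, 0, 0, 1, 1, 1]))  # close to 1
bad = silhouette(X, np.array([0, 1, 0, 1, 0, 1]))   # much lower
```

In practice the criterion is evaluated for several candidate numbers of clusters, and the value giving the largest mean silhouette width is retained.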