Unsupervised Learning

2018 | J. Webster (ed.)
This article reviews traditional and current methods of classification in unsupervised learning, focusing on cluster analysis and self-organizing neural networks. Both approaches aim to minimize the distance between input vectors and their representations without assuming a predefined cluster structure. Cluster analysis spans hard clustering (hierarchical and nonhierarchical), fuzzy clustering, and mixture clustering; self-organizing maps are artificial neural networks that learn to represent input data patterns through unsupervised learning.

Hard clustering assigns each unit to exactly one cluster based on similarity, whereas fuzzy clustering allows a unit to belong to several clusters with varying degrees of membership. Hierarchical clustering produces a nested sequence of partitions, built either agglomeratively or divisively and visualized with dendrograms, while nonhierarchical methods such as c-means and c-medoids partition the data directly into a fixed number of clusters. Dissimilarities between units are computed with distance measures such as the Euclidean, Manhattan, and Mahalanobis distances.

Among fuzzy methods, Fuzzy c-Means (FcM) minimizes a sum-of-squared-error objective, with a fuzziness parameter controlling how sharp the partition is. Cluster validity criteria, such as the Calinski-Harabasz and silhouette criteria, are used to choose the number of clusters. Fuzzy c-Medoids (FcMd) is a variant of FcM that represents each cluster by a medoid (an observed unit) rather than a centroid. Other fuzzy approaches include possibilistic clustering, which relaxes the membership constraints, and robust fuzzy clustering, which handles noise and outliers.
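The distance measures named above are easy to state concretely. Below is a minimal NumPy sketch (the article's software pointers are for R; Python is used here purely for illustration, and the data are invented toy values):

```python
import numpy as np

# Two toy units described by two variables; values are illustrative only.
x = np.array([1.0, 2.0])
y = np.array([4.0, 6.0])

# Euclidean distance: straight-line distance between the two units.
euclidean = np.sqrt(((x - y) ** 2).sum())   # -> 5.0

# Manhattan (city-block) distance: sum of absolute coordinate differences.
manhattan = np.abs(x - y).sum()             # -> 7.0

# Mahalanobis distance: Euclidean distance after rescaling by the inverse
# covariance of the data, so correlated or high-variance variables count less.
data = np.array([[1.0, 2.0], [4.0, 6.0], [2.0, 3.0], [5.0, 8.0]])
S_inv = np.linalg.inv(np.cov(data, rowvar=False))
diff = x - y
mahalanobis = np.sqrt(diff @ S_inv @ diff)
```

The Mahalanobis distance depends on the covariance of the whole data set, which is why it is computed from `data` rather than from the two units alone.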
Techniques such as fuzzy clustering with a noise cluster, fuzzy clustering with exponential distance, and trimmed fuzzy clustering are designed to improve robustness in the presence of noise. Methods for nonstandard data, such as fuzzy, symbolic, and interval data, are also reviewed, along with clustering approaches for complex data structures such as text, time, and spatial data. The article further discusses clustering strategies including co-clustering, comparison clustering, and consensus clustering, and highlights software implementations in R for these methods.
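Several of the fuzzy methods surveyed here are elaborations of plain FcM, so its alternating updates are worth seeing once. The following is a minimal NumPy sketch, not the article's implementation: memberships and centroids are updated in turn, with the fuzziness parameter m > 1 controlling how soft the partition is, and the toy data are invented for illustration.

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, n_iter=100, seed=0):
    """Minimal fuzzy c-means: alternate membership and centroid updates.

    X : (n, p) data matrix; c : number of clusters; m > 1 : fuzziness.
    Returns (U, V): membership matrix (n, c) and centroids (c, p).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships; each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centroid update: membership-weighted means of the data.
        V = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared distances from every unit to every centroid.
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + 1e-12
        # Membership update: inversely proportional to d2^(1/(m-1)).
        inv = d2 ** (-1.0 / (m - 1))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, V

# Two well-separated 1-D groups: memberships should end up near 0 or 1.
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
U, V = fuzzy_c_means(X, c=2)
```

Setting m close to 1 drives the memberships toward 0/1 and recovers hard c-means behavior; larger m gives a fuzzier partition.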
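The silhouette criterion mentioned among the validity measures also has a compact form: for each unit, compare its mean distance to its own cluster (a) with its mean distance to the nearest other cluster (b). A minimal NumPy sketch with invented toy data (the Calinski-Harabasz index plays an analogous role but is not shown):

```python
import numpy as np

def silhouette(X, labels):
    """Mean silhouette width: (b - a) / max(a, b) averaged over units,
    where a = mean distance to own cluster and b = mean distance to the
    nearest other cluster. Values near 1 indicate a good partition."""
    n = len(X)
    # Pairwise distance matrix (1-D data uses absolute differences).
    D = np.abs(X[:, None] - X[None, :]) if X.ndim == 1 else \
        np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    s = []
    for i in range(n):
        own = (labels == labels[i])
        other_ds = [D[i][labels == k].mean()
                    for k in set(labels) if k != labels[i]]
        if own.sum() == 1 or not other_ds:
            s.append(0.0)  # singleton clusters get silhouette 0 by convention
            continue
        a = D[i][own].sum() / (own.sum() - 1)  # excludes the zero self-distance
        b = min(other_ds)
        s.append((b - a) / max(a, b))
    return float(np.mean(s))

X = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
good = silhouette(X, np.array([0, 0, 0, 1, 1, 1]))  # close to 1
bad = silhouette(X, np.array([0, 1, 0, 1, 0, 1]))   # much lower
```

In practice the criterion is evaluated for several candidate numbers of clusters, and the value giving the largest mean silhouette width is retained.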