This paper presents a Classification EM (CEM) algorithm for clustering together with two stochastic versions: the Stochastic EM (SEM) algorithm and a simulated annealing version, the CAEM algorithm. The CEM algorithm generalizes the EM algorithm for mixture models and is designed to optimize the Classification Maximum Likelihood (CML) criterion. The CML criterion in turn generalizes the standard k-means criterion, which arises as the particular case of a Gaussian mixture with equal proportions and a common covariance matrix. CEM is a natural extension of EM, inserting a classification step, which assigns each observation to the component with the largest posterior probability, between the E-step and the M-step. The CEM algorithm is proven to increase the CML criterion at each iteration and to converge to a stationary value.
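To make the E-step/C-step/M-step cycle concrete, here is a minimal sketch of a CEM iteration in the k-means case of the CML criterion (Gaussian mixture, equal proportions, common spherical covariance), where the max-posterior classification reduces to a nearest-center assignment. The function name, initialization, and stopping rule are illustrative choices, not the paper's notation.

```python
import numpy as np

def cem(X, K, n_iter=100, seed=0):
    """Illustrative CEM iteration for the k-means case of the CML criterion."""
    rng = np.random.default_rng(seed)
    # Initialize centers from randomly chosen data points.
    centers = X[rng.choice(len(X), K, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # E-step + C-step: with equal proportions and a common spherical
        # covariance, the max-posterior assignment is the nearest center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = d.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition is stable; the criterion can no longer increase
        labels = new_labels
        # M-step: maximum likelihood estimate of each center
        # given the current partition.
        for k in range(K):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels, centers
```

Because the C-step makes hard assignments, each iteration can only increase the CML criterion, which is why the algorithm stops as soon as the partition no longer changes.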
The SEM algorithm is a stochastic version of the CEM algorithm that incorporates random perturbations to reduce the dependence of classical optimization clustering algorithms on their initial position. The CAEM algorithm is a simulated annealing version of CEM, designed to reach a stable fixed point of the CML criterion by exploiting the basic properties of the CEM algorithm. The CAEM algorithm proves more stable and gives better results, especially for small samples.
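The contrast between the three algorithms lies in the classification step. A hedged sketch: SEM replaces CEM's max-posterior assignment with a random draw from the posterior, and an annealed variant sharpens the posterior with a temperature parameter, interpolating between SEM-like draws (T = 1) and CEM's hard assignment (T → 0). The exact CAEM scheme in the paper may differ in detail; this is illustrative only.

```python
import numpy as np

def stochastic_c_step(posteriors, T=1.0, rng=None):
    """Draw one label per row from posteriors ** (1/T), renormalized.

    T = 1 gives an SEM-style random assignment; T -> 0 approaches
    CEM's deterministic max-posterior assignment.
    """
    rng = np.random.default_rng() if rng is None else rng
    p = posteriors ** (1.0 / T)
    p /= p.sum(axis=1, keepdims=True)
    # Inverse-CDF sampling: one draw per data point.
    cum = p.cumsum(axis=1)
    u = rng.random((len(p), 1))
    # Clip guards against floating-point round-off in the last cumsum entry.
    return (u > cum).sum(axis=1).clip(max=p.shape[1] - 1)
```

Decreasing T across iterations mimics an annealing schedule: early iterations explore the partition space freely, while later ones settle into a hard classification.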
Numerical experiments on simulated and real data sets compare the performance of the CEM, SEM, and CAEM algorithms. The results show that CEM performs well when the clustering structure is clear and the sample is large, while SEM and CAEM outperform it in other situations. CAEM is the most stable of the three and is more reliable than SEM for very small data sets. The experiments also show that CAEM can improve a poor sub-optimal solution produced by CEM.
The paper concludes that the CEM algorithm is a general clustering algorithm applicable to a wide range of clustering criteria. The SEM and CAEM algorithms are stochastic versions of CEM designed to reduce the dependence of classical optimization clustering algorithms on their initial position. The CAEM algorithm is recommended for small sample sizes, while the SEM algorithm is recommended for large sample sizes, especially when there is no apparent clustering structure in the data.