5 Feb 2013 | Ehsan Elhamifar, Student Member, IEEE, and René Vidal, Senior Member, IEEE
This paper introduces and studies an algorithm called Sparse Subspace Clustering (SSC) for clustering data points that lie in a union of low-dimensional subspaces. The key idea behind SSC is that a sparse representation of a data point corresponds to selecting a few points from the same subspace. This motivates solving a sparse optimization program whose solution is used in a spectral clustering framework to infer the clustering of the data into subspaces. Since solving the sparse optimization program is generally NP-hard, the paper considers a convex relaxation using $\ell_1$-minimization and shows that under appropriate conditions on the arrangement of subspaces and data distribution, the proposed minimization program succeeds in recovering the desired sparse representations. The algorithm is efficient and can handle data points near the intersections of subspaces. It also directly deals with data nuisances such as noise, sparse outlying entries, and missing entries by incorporating the model of the data into the sparse optimization program. The effectiveness of the proposed algorithm is demonstrated through experiments on synthetic data and real-world problems of motion segmentation and face clustering.This paper introduces and studies an algorithm called Sparse Subspace Clustering (SSC) for clustering data points that lie in a union of low-dimensional subspaces. The key idea behind SSC is that a sparse representation of a data point corresponds to selecting a few points from the same subspace. This motivates solving a sparse optimization program whose solution is used in a spectral clustering framework to infer the clustering of the data into subspaces. Since solving the sparse optimization program is generally NP-hard, the paper considers a convex relaxation using $\ell_1$-minimization and shows that under appropriate conditions on the arrangement of subspaces and data distribution, the proposed minimization program succeeds in recovering the desired sparse representations. The algorithm is efficient and can handle data points near the intersections of subspaces. It also directly deals with data nuisances such as noise, sparse outlying entries, and missing entries by incorporating the model of the data into the sparse optimization program. The effectiveness of the proposed algorithm is demonstrated through experiments on synthetic data and real-world problems of motion segmentation and face clustering.