December 2005 | René Vidal, Member, IEEE, Yi Ma, Member, IEEE, and Shankar Sastry, Fellow, IEEE
Generalized Principal Component Analysis (GPCA) is an algebraic-geometric method for segmenting an unknown number of subspaces of unknown and varying dimensions from sample data points. The method represents subspaces using homogeneous polynomials whose degree equals the number of subspaces. The derivatives of these polynomials at data points provide normal vectors to the subspaces. When the number of subspaces is known, GPCA reduces subspace segmentation to classifying one point per subspace. Points are selected optimally by minimizing a distance function, automatically handling moderate noise. A basis for the complement of each subspace is recovered by applying standard PCA to the derivatives (normal vectors) at these points. GPCA extends to high-dimensional data and unknown numbers of subspaces. Experiments on low-dimensional data show that GPCA outperforms existing algebraic algorithms based on polynomial factorization and provides a good initialization for iterative techniques like K-subspaces and Expectation Maximization. Applications include face clustering, temporal video segmentation, and 3D motion segmentation from point correspondences in multiple affine views. GPCA uses the Veronese map as a natural embedding for data in a union of subspaces. The method involves polynomial fitting, differentiation, and division to identify subspaces and their normal vectors. GPCA is robust to noise and can handle arbitrary intersections of subspaces. The algorithm is non-iterative and provides a global solution to subspace segmentation.Generalized Principal Component Analysis (GPCA) is an algebraic-geometric method for segmenting an unknown number of subspaces of unknown and varying dimensions from sample data points. The method represents subspaces using homogeneous polynomials whose degree equals the number of subspaces. The derivatives of these polynomials at data points provide normal vectors to the subspaces. When the number of subspaces is known, GPCA reduces subspace segmentation to classifying one point per subspace. Points are selected optimally by minimizing a distance function, automatically handling moderate noise. A basis for the complement of each subspace is recovered by applying standard PCA to the derivatives (normal vectors) at these points. GPCA extends to high-dimensional data and unknown numbers of subspaces. Experiments on low-dimensional data show that GPCA outperforms existing algebraic algorithms based on polynomial factorization and provides a good initialization for iterative techniques like K-subspaces and Expectation Maximization. Applications include face clustering, temporal video segmentation, and 3D motion segmentation from point correspondences in multiple affine views. GPCA uses the Veronese map as a natural embedding for data in a union of subspaces. The method involves polynomial fitting, differentiation, and division to identify subspaces and their normal vectors. GPCA is robust to noise and can handle arbitrary intersections of subspaces. The algorithm is non-iterative and provides a global solution to subspace segmentation.