2009 | Daniel Aloise · Amit Deshpande · Pierre Hansen · Preyas Popat
The paper addresses the NP-hardness of Euclidean sum-of-squares clustering, a problem that aims to partition a set of entities into homogeneous and well-separated clusters by minimizing the sum of squared Euclidean distances from each entity to its cluster centroid. The authors provide a detailed analysis of a proof by Drineas et al. (2004) that claims to show the NP-hardness of this problem for $k=2$ and general dimension. They identify an error in the reduction from the minimum bisection problem, which invalidates the proof. Instead, they present an alternate short proof and another valid proof by reduction from the densest cut problem, demonstrating that the problem is indeed NP-hard. The paper also discusses the implications of these findings and references related works.The paper addresses the NP-hardness of Euclidean sum-of-squares clustering, a problem that aims to partition a set of entities into homogeneous and well-separated clusters by minimizing the sum of squared Euclidean distances from each entity to its cluster centroid. The authors provide a detailed analysis of a proof by Drineas et al. (2004) that claims to show the NP-hardness of this problem for $k=2$ and general dimension. They identify an error in the reduction from the minimum bisection problem, which invalidates the proof. Instead, they present an alternate short proof and another valid proof by reduction from the densest cut problem, demonstrating that the problem is indeed NP-hard. The paper also discusses the implications of these findings and references related works.