March 5, 2025 | Khashayar Gatmiry, Jonathan Kelner, Holden Lee
This paper presents a new algorithm for learning mixtures of $k$ Gaussians with identity covariance in $\mathbb{R}^n$ to TV error $\varepsilon$, achieving quasi-polynomial time and sample complexity ($O(n^{\text{poly} \log ( \frac{1}{\varepsilon} )})$) under a minimum-weight assumption. The results extend to continuous mixtures whose mixing distribution is supported on a union of $k$ balls of constant radius; this includes Gaussian convolutions of distributions supported on low-dimensional manifolds or, more generally, on sets with small covering number. The approach is analytic and relies on diffusion models, a modern paradigm for generative modeling; unlike previous algebraic methods, it provides theoretical guarantees for learning nontrivial families of distributions with diffusion models. The algorithm learns score functions for a sequence of decreasing noise levels via piecewise polynomial regression, leveraging higher-order noise sensitivity bounds. The proof techniques include higher-order noise sensitivity, higher-order smoothing, and the use of the diffusion process to maintain clusters. The results are significant because they give the first sub-exponential complexity guarantees for learning these generalized Gaussian mixtures, which are more challenging than discrete mixtures.
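To make the "score functions at decreasing noise levels" step concrete, here is a minimal, hypothetical sketch; it is not the paper's algorithm. The paper estimates the scores from samples via piecewise polynomial regression and runs the reverse diffusion process, whereas this sketch substitutes the exact closed-form score of an identity-covariance Gaussian mixture under Gaussian noise and uses generic annealed Langevin dynamics as the sampler. The function names (`mixture_score`, `annealed_langevin_sample`) and all parameter choices are illustrative assumptions.

```python
# Illustrative sketch only (not the paper's method): annealed score-based sampling
# for a mixture of identity-covariance Gaussians, using the exact noisy score
# in place of learned (piecewise-polynomial) score estimates.
import numpy as np

def mixture_score(x, means, weights, sigma2):
    """Score grad log p_t(x), where p_t = sum_i w_i N(mu_i, (1 + sigma2) I).

    Convolving a component N(mu_i, I) with noise N(0, sigma2 I) gives
    N(mu_i, (1 + sigma2) I), so the score at noise level sigma2 is closed-form.
    """
    var = 1.0 + sigma2
    diffs = means - x                                   # (k, n): mu_i - x
    logits = -np.sum(diffs**2, axis=1) / (2 * var) + np.log(weights)
    logits -= logits.max()                              # numerical stability
    post = np.exp(logits)
    post /= post.sum()                                  # posterior over components
    return post @ diffs / var                           # sum_i post_i (mu_i - x) / var

def annealed_langevin_sample(means, weights, noise_levels, steps_per_level=50,
                             step_size=0.05, seed=None):
    """Run Langevin dynamics at a decreasing sequence of noise levels,
    warm-starting each level from the previous one."""
    rng = np.random.default_rng(seed)
    n = means.shape[1]
    x = rng.normal(scale=np.sqrt(1.0 + noise_levels[0]**2), size=n)
    for sigma in noise_levels:                          # largest to smallest noise
        eta = step_size * sigma**2                      # step size scaled to noise level
        for _ in range(steps_per_level):
            s = mixture_score(x, means, weights, sigma**2)
            x = x + eta * s + np.sqrt(2 * eta) * rng.normal(size=n)
    return x

# Example: a 3-component mixture in R^10.
rng = np.random.default_rng(0)
means = rng.normal(scale=5.0, size=(3, 10))
weights = np.array([0.5, 0.3, 0.2])
noise_levels = np.geomspace(10.0, 0.01, num=20)
sample = annealed_langevin_sample(means, weights, noise_levels, seed=1)
print(sample.shape)
```

In the learning setting the closed-form score is of course unavailable; the point of the paper is that these scores can be estimated to sufficient accuracy from samples, which is where the higher-order noise sensitivity bounds and piecewise polynomial regression enter.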