SQ Lower Bounds for Non-Gaussian Component Analysis with Weaker Assumptions

SQ Lower Bounds for Non-Gaussian Component Analysis with Weaker Assumptions

March 8, 2024 | Ilias Diakonikolas, Daniel M. Kane, Lisheng Ren, Yuxin Sun
**Summary:** This paper presents a new lower bound for the complexity of Non-Gaussian Component Analysis (NGCA) in the Statistical Query (SQ) model. The authors show that the chi-squared distance condition, previously required for SQ hardness results, is not necessary. Instead, they establish a near-optimal SQ lower bound under only the moment-matching condition. This result generalizes to hidden subspaces and applies to various estimation tasks where existing methods provide sub-optimal or vacuous guarantees. The paper introduces the hypothesis testing version of NGCA, where the goal is to distinguish between a standard Gaussian and a distribution that behaves like a non-Gaussian distribution in a hidden subspace. The main result is a SQ lower bound that requires either a query with high accuracy or an exponential number of queries. This bound is derived using Fourier analysis and Hermite polynomials, and it applies to a wide range of statistical tasks, including list-decodable Gaussian mean estimation, anti-concentration detection, and learning periodic functions. The authors also discuss the implications of their results for other learning problems, showing that the SQ lower bound can be used to derive hardness results for tasks such as robust mean estimation, linear regression, and learning halfspaces. They highlight that the chi-squared distance condition is not essential for the SQ lower bounds and that their result improves upon previous work by removing this requirement. The paper concludes with a technical overview of the proof, which involves analyzing the expectation of query functions under the hidden subspace distribution and using Fourier decomposition to bound the difference between the Gaussian and the hidden subspace distribution. The authors also address technical challenges, such as handling unbounded chi-squared norms and ensuring the convergence of the Fourier series. Overall, the paper provides a significant advancement in understanding the SQ complexity of NGCA and its applications to other statistical tasks.**Summary:** This paper presents a new lower bound for the complexity of Non-Gaussian Component Analysis (NGCA) in the Statistical Query (SQ) model. The authors show that the chi-squared distance condition, previously required for SQ hardness results, is not necessary. Instead, they establish a near-optimal SQ lower bound under only the moment-matching condition. This result generalizes to hidden subspaces and applies to various estimation tasks where existing methods provide sub-optimal or vacuous guarantees. The paper introduces the hypothesis testing version of NGCA, where the goal is to distinguish between a standard Gaussian and a distribution that behaves like a non-Gaussian distribution in a hidden subspace. The main result is a SQ lower bound that requires either a query with high accuracy or an exponential number of queries. This bound is derived using Fourier analysis and Hermite polynomials, and it applies to a wide range of statistical tasks, including list-decodable Gaussian mean estimation, anti-concentration detection, and learning periodic functions. The authors also discuss the implications of their results for other learning problems, showing that the SQ lower bound can be used to derive hardness results for tasks such as robust mean estimation, linear regression, and learning halfspaces. They highlight that the chi-squared distance condition is not essential for the SQ lower bounds and that their result improves upon previous work by removing this requirement. The paper concludes with a technical overview of the proof, which involves analyzing the expectation of query functions under the hidden subspace distribution and using Fourier decomposition to bound the difference between the Gaussian and the hidden subspace distribution. The authors also address technical challenges, such as handling unbounded chi-squared norms and ensuring the convergence of the Fourier series. Overall, the paper provides a significant advancement in understanding the SQ complexity of NGCA and its applications to other statistical tasks.
Reach us at info@study.space
[slides and audio] SQ Lower Bounds for Non-Gaussian Component Analysis with Weaker Assumptions