12 Jun 2024 | Andre Wibisono, Yihong Wu, Kaylee Yingxi Yang
The paper addresses the problem of estimating the score function of an unknown probability distribution $\rho^*$ from $n$ independent and identically distributed observations in $d$ dimensions. The authors assume that $\rho^*$ is subgaussian and has a Lipschitz-continuous score function $s^*$. Under the loss $\|s - s^*\|_{L^2(\rho^*)}^2$, they establish the optimal rate $\tilde\Theta(n^{-2/(d+4)})$ for this estimation problem, highlighting a curse of dimensionality: the sample complexity needed to reach a fixed accuracy grows exponentially with the dimension $d$. Leveraging insights from empirical Bayes theory and a new convergence rate for the smoothed empirical distribution in Hellinger distance, they show that a regularized score estimator based on a Gaussian kernel achieves this rate, which is optimal by a matching minimax lower bound. The paper also discusses extensions to estimating $\beta$-Hölder continuous scores with $\beta \leq 1$ and the implications of these results for the sample complexity of score-based generative models (SGMs).
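As a concrete illustration, the score of the Gaussian-smoothed empirical distribution admits a simple closed form: if $\hat\rho_h = \frac{1}{n}\sum_i \mathcal{N}(x_i, h^2 I)$, then $\nabla \log \hat\rho_h(y) = (\bar{x}(y) - y)/h^2$, where $\bar{x}(y)$ is a softmax-weighted average of the samples. The sketch below implements this unregularized version; the paper's actual estimator adds a careful regularization step not shown here, and the function name and interface are illustrative, not from the paper.

```python
import numpy as np

def kernel_score_estimator(data, y, h):
    """Score of the Gaussian-smoothed empirical distribution at query point y.

    data: (n, d) array of samples x_1, ..., x_n
    y:    (d,) query point
    h:    kernel bandwidth

    The gradient of log[(1/n) sum_i N(y; x_i, h^2 I)] equals
    (weighted mean of the x_i minus y) / h^2, where the weights are
    softmax(-||y - x_i||^2 / (2 h^2)).
    """
    diffs = data - y                                 # (n, d)
    logw = -np.sum(diffs ** 2, axis=1) / (2 * h ** 2)
    logw -= logw.max()                               # subtract max for numerical stability
    w = np.exp(logw)
    w /= w.sum()                                     # normalized softmax weights
    return (w @ data - y) / h ** 2
```

With a single sample the estimator reduces to $(x_1 - y)/h^2$, the score of one Gaussian centered at $x_1$; with symmetric data and $y$ at the center of symmetry it returns zero, as the smoothed density has a critical point there.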