Optimal score estimation via empirical Bayes smoothing

12 Jun 2024 | Andre Wibisono, Yihong Wu, and Kaylee Yingxi Yang
This paper studies the problem of estimating the score function of an unknown probability distribution $\rho^*$ from $n$ independent and identically distributed observations in $d$ dimensions. Assuming $\rho^*$ is subgaussian and has a Lipschitz-continuous score function $s^*$, the authors establish the optimal rate of $\tilde{\Theta}(n^{-\frac{2}{d+4}})$ for this estimation problem under the loss $\|\hat{s}-s^*\|_{L^{2}(\rho^*)}^{2}$, highlighting the curse of dimensionality: the sample complexity required for accurate score estimation grows exponentially with the dimension $d$. Leveraging insights from empirical Bayes theory and a new convergence rate for smoothed empirical distributions in Hellinger distance, they show that a regularized score estimator based on a Gaussian kernel attains this rate, and a matching minimax lower bound shows the rate cannot be improved beyond logarithmic factors.

The paper also covers extensions to estimating $\beta$-Hölder continuous scores with $\beta \leq 1$, and discusses the implications of the theory for score-based generative models (SGMs), whose sample complexity depends on the dimension $d$ and the desired accuracy of the score estimate. Detailed proofs of the main results are provided, including the analysis of the regularized score estimator via empirical Bayes techniques and the derivation of the matching upper and lower bounds on the minimax risk.
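To make the empirical Bayes connection concrete, here is a minimal sketch of a Gaussian-kernel score estimator: the score of the Gaussian-smoothed empirical distribution at a point $y$ equals, by Tweedie's formula, $(\hat{\mathbb{E}}[X \mid Y=y] - y)/\sigma^2$, where the posterior mean is taken with the empirical distribution of the samples as the prior. This is only an illustration of that formula, not the paper's exact construction: the function name, the bandwidth choice $\sigma \asymp n^{-1/(d+4)}$, and the omission of the explicit regularization step (e.g., truncating the estimate where the smoothed density is small) are assumptions made here for brevity.

```python
import numpy as np


def gaussian_kernel_score(x, samples, sigma):
    """
    Score (gradient of log density) of the Gaussian-smoothed empirical
    distribution rho_hat_n * N(0, sigma^2 I), evaluated at a single point x.

    By Tweedie's formula this equals (E[X | Y = x] - x) / sigma^2, with the
    posterior mean taken under the empirical "prior" over the samples.
    """
    diffs = samples - x                      # (n, d) differences X_i - x
    sq_dists = np.sum(diffs ** 2, axis=1)    # (n,) squared distances

    # Softmax weights proportional to exp(-||x - X_i||^2 / (2 sigma^2)),
    # computed stably by subtracting the maximum exponent.
    logits = -sq_dists / (2.0 * sigma ** 2)
    logits -= logits.max()
    weights = np.exp(logits)
    weights /= weights.sum()

    # Posterior mean of X given Y = x under the empirical prior.
    posterior_mean = weights @ samples       # (d,)

    return (posterior_mean - x) / sigma ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n = 2, 5000
    samples = rng.standard_normal((n, d))    # i.i.d. draws from N(0, I_d)

    # Bandwidth on the order of n^{-1/(d+4)}, matching the rate above
    # up to constants and logarithmic factors (an illustrative choice).
    sigma = n ** (-1.0 / (d + 4))

    x = np.array([0.5, -1.0])
    print("estimated score:", gaussian_kernel_score(x, samples, sigma))
    print("true score of N(0, I):", -x)
```

For a standard Gaussian target the true score is $-x$, so the printed estimate should be close to it for moderate $n$; in higher dimensions the error degrades at the $n^{-2/(d+4)}$ rate discussed above, which is the curse of dimensionality the paper quantifies.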