Measuring Statistical Dependence with Hilbert-Schmidt Norms

June 2005 | Arthur Gretton, Olivier Bousquet, Alexander Smola, Bernhard Schölkopf
This paper introduces the Hilbert-Schmidt Independence Criterion (HSIC), a method for measuring statistical dependence between random variables. HSIC is based on the eigenspectrum of covariance operators in reproducing kernel Hilbert spaces (RKHSs): it is defined as the squared Hilbert-Schmidt norm of the cross-covariance operator, and it is shown to be zero if and only if the random variables are independent.

Compared with previous kernel-based independence criteria, HSIC has several advantages: it requires no user-defined regularisation, it converges to a clearly defined population quantity in the large-sample limit, and it is computationally efficient. The paper provides an empirical estimate of HSIC that converges to the population value at a rate of $1/\sqrt{m}$, where $m$ is the sample size, together with a bound on the deviation between the empirical HSIC and its population counterpart, which allows HSIC to serve as an independence test. An efficient approximation to the empirical HSIC based on the incomplete Cholesky decomposition is also described. Finally, the paper applies HSIC to independent component analysis (ICA) and shows that it is competitive with other kernel-based criteria and established ICA methods in experimental settings.
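The (biased) empirical estimator discussed in the paper has the closed form $\mathrm{tr}(KHLH)/(m-1)^2$, where $K$ and $L$ are kernel matrices on the two samples and $H = I - \frac{1}{m}\mathbf{1}\mathbf{1}^\top$ is the centering matrix. A minimal NumPy sketch of this estimator is below; the Gaussian RBF kernel, the fixed bandwidth, and the toy data are illustrative choices, not taken from the paper.

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """Gaussian RBF kernel matrix for rows of X (bandwidth sigma is an assumption)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # pairwise squared distances
    return np.exp(-d2 / (2.0 * sigma**2))

def hsic(X, Y, sigma=1.0):
    """Biased empirical HSIC: tr(K H L H) / (m - 1)^2."""
    m = X.shape[0]
    K = rbf_kernel(X, sigma)
    L = rbf_kernel(Y, sigma)
    H = np.eye(m) - np.ones((m, m)) / m  # centering matrix
    return np.trace(K @ H @ L @ H) / (m - 1) ** 2

# Toy check: a dependent pair should score higher than an independent one.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
y_dep = x + 0.1 * rng.normal(size=(200, 1))   # strongly dependent on x
y_ind = rng.normal(size=(200, 1))             # independent of x
print(hsic(x, y_dep), hsic(x, y_ind))
```

Since $HKH$ and $HLH$ are positive semidefinite, the statistic is nonnegative, and larger values indicate stronger dependence between the two samples.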