October 23, 2018 | Pradeep Ravikumar†, Martin J. Wainwright†, Garvesh Raskutti†, Bin Yu†,‡
This paper studies the problem of estimating the covariance matrix $ \Sigma^{*} $ and the inverse covariance or concentration matrix $ \Theta^{*} = (\Sigma^{*})^{-1} $ of a high-dimensional random vector $ X \in \mathbb{R}^p $ based on i.i.d. observations. The focus is on estimating $ \Theta^{*} $ by minimizing an $ \ell_1 $-penalized log-determinant Bregman divergence. In the multivariate Gaussian case, this corresponds to $ \ell_1 $-penalized maximum likelihood, and the structure of $ \Theta^{*} $ is determined by the graph of a Gaussian Markov random field (GMRF). The paper analyzes the performance of this estimator under high-dimensional scaling, where the number of nodes $ p $, edges $ s $, and maximum node degree $ d $ grow with the sample size $ n $. Key quantities controlling the rates include the $ \ell_\infty $-operator norm of $ \Sigma^{*} $, the $ \ell_\infty $-operator norm of the submatrix $ \Gamma_{SS}^{*} $, and a mutual incoherence or irrepresentability measure on $ \Gamma^{*} $. The first result establishes consistency of the estimator $ \widehat{\Theta} $ in the elementwise $ \ell_\infty $-norm, leading to convergence rates in Frobenius and spectral norms. The second result shows that $ \widehat{\Theta} $ correctly specifies the zero pattern of $ \Theta^{*} $ with high probability. The paper also compares the log-determinant approach to the neighborhood-based Lasso method, showing that the log-determinant approach has similar dependence on $ \theta_{\min} $ but quadratic dependence on $ d $, while the Lasso has linear dependence on $ d $. The analysis is supported by simulations for various graphs and problem parameters.
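For concreteness, the estimator at the center of the analysis can be written in its standard $ \ell_1 $-penalized log-determinant form (a sketch assuming the usual off-diagonal penalty and the sample covariance $ \widehat{\Sigma} $; the paper's exact conventions, e.g. whether the diagonal is penalized, may differ):

$$
\widehat{\Theta} \;=\; \arg\min_{\Theta \succ 0} \Big\{ \operatorname{tr}\big(\widehat{\Sigma}\,\Theta\big) \;-\; \log\det\Theta \;+\; \lambda_n \sum_{i \neq j} |\Theta_{ij}| \Big\},
$$

where $ \operatorname{tr}(\widehat{\Sigma}\,\Theta) - \log\det\Theta $ is, up to constants, the Gaussian negative log-likelihood (the Bregman divergence induced by $ -\log\det $), and $ \lambda_n > 0 $ is the regularization parameter controlling the sparsity of $ \widehat{\Theta} $.

As a hypothetical illustration of the model-selection result (not code from the paper), the sketch below fits this estimator via scikit-learn's GraphicalLasso on data drawn from a chain-graph GMRF and checks whether the off-diagonal zero pattern of $ \Theta^{*} $ is recovered; the graph, dimensions, and regularization level are placeholder choices.

```python
# Hypothetical illustration (not from the paper): recover the sparsity pattern
# of a chain-graph precision matrix with an l1-penalized log-determinant
# estimator, here via scikit-learn's GraphicalLasso implementation.
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
p, n = 30, 2000

# Tridiagonal (chain-graph) precision matrix Theta*: nonzeros only on the
# diagonal and the first off-diagonals, so the maximum node degree is d = 2.
theta_star = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
sigma_star = np.linalg.inv(theta_star)

# i.i.d. Gaussian samples with covariance Sigma*.
X = rng.multivariate_normal(np.zeros(p), sigma_star, size=n)

# Fit the estimator; alpha plays the role of the regularization parameter lambda_n.
model = GraphicalLasso(alpha=0.05, max_iter=200).fit(X)
theta_hat = model.precision_

# Compare the estimated and true edge sets (off-diagonal support).
est_edges = np.abs(theta_hat) > 1e-3
true_edges = np.abs(theta_star) > 0
np.fill_diagonal(est_edges, False)
np.fill_diagonal(true_edges, False)
print("correct zero pattern:", np.array_equal(est_edges, true_edges))
```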