Learning Overcomplete Representations


2000 | Michael S. Lewicki, Terrence J. Sejnowski
This paper presents an algorithm for learning overcomplete bases by viewing them as probabilistic models of the observed data. Overcomplete representations, in which the number of basis vectors exceeds the dimensionality of the input, offer advantages such as greater robustness to noise, sparsity, and flexibility in matching the structure of the data. The authors show that overcomplete bases can better approximate the underlying statistical distribution of the data and thereby code it more efficiently. The approach generalizes independent component analysis (ICA) and provides a method for Bayesian reconstruction of signals in the presence of noise and for blind source separation when there are more sources than mixtures.

The paper introduces a probabilistic model in which each data vector is represented as a linear combination of basis functions plus additive noise. The most probable representation of a data vector is found by maximizing the posterior distribution of the coefficients. This formulation places a prior distribution on the basis-function coefficients, which removes the redundancy of the overcomplete basis and leads to sparse, nonlinear representations of the data. The choice of prior determines the character of the representation: a Gaussian prior yields a linear representation, while a Laplacian prior yields a sparse, nonlinear one.
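To make the inference step concrete, here is a minimal numerical sketch (not the authors' code): it finds the most probable coefficients of an overcomplete basis under a Gaussian noise model and a Laplacian prior by minimizing squared reconstruction error plus an L1 penalty. The function name map_coefficients, the weight lam, the iteration count, and the choice of ISTA (proximal gradient) as the optimizer are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of MAP coefficient inference for the model x = A s + noise,
# with a Laplacian (sparsity-inducing) prior on the coefficients s.
# Not the authors' code: function name, `lam`, and the use of ISTA are
# illustrative assumptions.
import numpy as np

def map_coefficients(A, x, lam=0.1, n_iters=200):
    """Return sparse coefficients s that approximately maximize the posterior
    p(s | x, A), i.e. minimize 0.5 * ||x - A s||^2 + lam * ||s||_1."""
    n_basis = A.shape[1]                       # n_basis > dim(x) for an overcomplete basis
    s = np.zeros(n_basis)
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # step size from the spectral norm of A
    for _ in range(n_iters):
        grad = A.T @ (A @ s - x)               # gradient of the Gaussian data term
        z = s - step * grad
        # Soft thresholding: the proximal operator of the Laplacian (L1) prior
        s = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return s

# Toy usage: represent a 2-D vector in a 3-vector (overcomplete) basis.
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
A /= np.linalg.norm(A, axis=0)                 # unit-norm basis vectors
x = A @ np.array([1.5, 0.0, 0.0]) + 0.01 * rng.standard_normal(2)
print(map_coefficients(A, x))                  # typically only a few entries are far from zero
```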
The basis functions themselves are learned by maximizing the data likelihood, giving a method that can handle arbitrary input noise levels and allows objective comparison of different models. Because an overcomplete basis can better approximate the underlying statistical density of the input, it can represent the data more efficiently and accurately than a complete basis.

The algorithm is demonstrated on two-dimensional data sets, where overcomplete bases fit complex data distributions more effectively than complete bases, and on speech data, where it learns sparse representations that capture the underlying structure of the signal and achieve better coding efficiency than traditional representations such as the Fourier basis. The paper concludes that the probabilistic formulation of the basis inference problem makes the assumptions about the prior on the coefficients explicit, provides a natural way to compare different models, and generalizes ICA to account for additive noise and overcomplete bases. The approach also yields a method for denoising data and for blind source separation when there are more sources than mixtures.
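The learning procedure summarized above can likewise be sketched as an alternating loop: infer the MAP coefficients for each data vector, then nudge the basis along a likelihood gradient. The sketch below uses the plain Gaussian-likelihood gradient with unit-norm renormalization rather than the paper's exact (natural-gradient-like) update; the names learn_basis, lr, and n_epochs are illustrative assumptions, and it reuses map_coefficients from the previous sketch.

```python
# Simplified sketch of basis learning: alternate MAP inference of the
# coefficients (map_coefficients from the sketch above) with a gradient step
# on the basis matrix A. The update (x - A s) s^T is the Gaussian-likelihood
# gradient with the coefficients fixed at their MAP values; it illustrates the
# alternating structure, not the paper's exact learning rule.
import numpy as np

def learn_basis(X, n_basis, lam=0.1, lr=0.05, n_epochs=50):
    """X has shape (n_samples, dim); returns a dim x n_basis basis matrix A."""
    rng = np.random.default_rng(0)
    dim = X.shape[1]
    A = rng.standard_normal((dim, n_basis))
    A /= np.linalg.norm(A, axis=0)                 # start from unit-norm basis vectors
    for _ in range(n_epochs):
        for x in X:
            s = map_coefficients(A, x, lam=lam)    # inference step (previous sketch)
            A += lr * np.outer(x - A @ s, s)       # likelihood-gradient step on the basis
            A /= np.linalg.norm(A, axis=0)         # renormalize to keep the scale fixed
    return A
```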