VOL. 44, NO. 6, OCTOBER 1998 | Andrew Barron, Member, IEEE, Jorma Rissanen, Senior Member, IEEE, and Bin Yu, Senior Member, IEEE
The paper reviews the Minimum Description Length (MDL) principle and stochastic complexity in data compression and statistical modeling. It formulates stochastic complexity as the solution to optimal universal coding problems, extending Shannon's source coding theorem. The normalized maximized likelihood, mixture, and predictive codings are each shown to achieve the stochastic complexity to within asymptotically vanishing terms. The performance of the MDL criterion is assessed from both data compression and statistical inference perspectives, using context tree modeling, density estimation, and model selection in Gaussian linear regression as examples. The MDL principle is discussed for both parametric and nonparametric inference, with emphasis on its use in automating model selection from data. The paper also explores the connection between statistical inference and predictive coding, and applies the MDL principle to universal coding, linear regression, and density estimation.
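To make the claim about the normalized maximized likelihood (NML) code concrete, the sketch below computes its code length exactly for a Bernoulli model class and compares the complexity term with the standard asymptotic expansion (k/2) log(n / 2π) + log ∫ sqrt(|I(θ)|) dθ, which for one Bernoulli parameter reduces to (1/2) log(nπ/2). The Bernoulli class is an assumption chosen here for illustration (it is not one of the paper's worked examples), and all function names are hypothetical. Code lengths are in nats.

```python
import math


def log_max_likelihood(k: int, n: int) -> float:
    """Log of the maximized Bernoulli likelihood for k ones in n trials.

    The maximum-likelihood estimate is k/n; the convention 0 * log 0 = 0
    is applied at the boundaries k = 0 and k = n.
    """
    ll = 0.0
    if k > 0:
        ll += k * math.log(k / n)
    if k < n:
        ll += (n - k) * math.log((n - k) / n)
    return ll


def log_nml_normalizer(n: int) -> float:
    """Log of C_n = sum_k C(n, k) (k/n)^k ((n-k)/n)^(n-k).

    C_n sums the maximized likelihood over all binary sequences of
    length n, grouped by their count of ones. Computed with
    log-sum-exp for numerical stability.
    """
    terms = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + log_max_likelihood(k, n)
        for k in range(n + 1)
    ]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


def stochastic_complexity(k: int, n: int) -> float:
    """NML code length: -log p(x^n | theta_hat) + log C_n."""
    return -log_max_likelihood(k, n) + log_nml_normalizer(n)


def asymptotic_penalty(n: int) -> float:
    """(1/2) log(n / (2 pi)) + log(pi) for the one-parameter Bernoulli class.

    The integral of sqrt(Fisher information) = 1 / sqrt(theta (1 - theta))
    over [0, 1] equals pi.
    """
    return 0.5 * math.log(n / (2 * math.pi)) + math.log(math.pi)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        exact = log_nml_normalizer(n)
        approx = asymptotic_penalty(n)
        print(f"n={n:5d}  log C_n={exact:.4f}  asymptotic={approx:.4f}")
```

The gap between the exact and asymptotic values shrinks as n grows, which is the sense in which the expansion holds up to vanishing terms in this simple setting.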