The paper "On the Histogram as a Density Estimator: $L_2$ Theory" by David Freedman and Persi Diaconis studies the histogram as an estimator of a probability density function. The authors define the empirical histogram for independent random variables with a common density \( f \): the height of the histogram over the cell containing a point \( x \) is the number of observations falling in that cell, divided by the total number of observations times the cell width, so that the histogram is itself a density.
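As an illustration, the estimator can be sketched as follows. This is a minimal NumPy version; the function name and the choice of cells anchored at the origin are assumptions for the sketch, not the paper's notation.

```python
import numpy as np

def histogram_density(data, h, x):
    """Empirical histogram density estimate.

    Cells are the intervals [j*h, (j+1)*h); the height over a cell is the
    number of observations in that cell divided by (n * h), so the
    estimate integrates to one.
    """
    data = np.asarray(data, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    n = data.size
    obs_cells = np.floor(data / h).astype(int)   # cell index of each observation
    eval_cells = np.floor(x / h).astype(int)     # cell index of each evaluation point
    counts = np.array([(obs_cells == c).sum() for c in eval_cells])
    return counts / (n * h)
```

For example, with the four observations \( 0.1, 0.2, 0.3, 0.6 \) and \( h = 0.5 \), the estimate is \( 1.5 \) on \( [0, 0.5) \) and \( 0.5 \) on \( [0.5, 1) \).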
The main focus is the mean squared difference between the empirical histogram \( H \) and the true density \( f \), \( \delta^2 = \int \{H(x) - f(x)\}^2 \, dx \). The paper analyzes this discrepancy under smoothness assumptions on \( f \): that \( f \) is in \( L_2 \) and has a continuous derivative \( f' \) that is also in \( L_2 \).
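This quantity can be approximated numerically as a sketch; the function name and the Riemann-sum integration over an equally spaced grid are illustrative assumptions.

```python
import numpy as np

def squared_l2_error(data, h, f, grid):
    """Approximate delta^2 = integral of (H(x) - f(x))^2 dx by a
    Riemann sum over an equally spaced grid."""
    data = np.asarray(data, dtype=float)
    n = data.size
    obs_cells = np.floor(data / h).astype(int)
    grid_cells = np.floor(grid / h).astype(int)
    # Histogram height at each grid point: cell count over (n * h).
    H = np.array([(obs_cells == c).sum() for c in grid_cells]) / (n * h)
    dx = grid[1] - grid[0]
    return float(np.sum((H - f(grid)) ** 2) * dx)
```

As a sanity check, for the data \( 0.25, 0.75 \) with \( h = 0.5 \) and \( f \) the uniform density on \( [0, 1) \), the histogram reproduces \( f \) exactly and the error is zero.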
Key results include:
1. **Theorem (1.6)**: Under the stronger smoothness conditions, the cell width \( h \) minimizing the expected value of \( \delta^2 \) is \( \alpha k^{-1/3} + O(k^{-1/2}) \), where \( k \) is the number of observations, and the minimized expectation is \( \beta k^{-2/3} + O(k^{-1}) \).
2. **Theorem (1.7)**: Under weaker conditions, the optimal cell width remains \( \alpha k^{-1/3} + o(k^{-1/3}) \) and the minimum expected difference is \( \beta k^{-2/3} + o(k^{-2/3}) \).
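The leading constant in these theorems, assuming the standard form \( \alpha = \{6 / \int f'(x)^2 \, dx\}^{1/3} \), can be evaluated directly. The sketch below does so; the function name is an assumption, and the normal-density special case is the familiar \( \approx 3.49\,k^{-1/3} \) reduction.

```python
import math

def optimal_cell_width(fprime_sq_integral, k):
    """Leading-order optimal cell width alpha * k**(-1/3), assuming
    alpha = (6 / integral of f'(x)^2 dx)**(1/3)."""
    alpha = (6.0 / fprime_sq_integral) ** (1.0 / 3.0)
    return alpha * k ** (-1.0 / 3.0)

# Standard normal: integral of f'(x)^2 dx = 1 / (4 * sqrt(pi)),
# so the rule becomes (24 * sqrt(pi))**(1/3) * k**(-1/3), about 3.49 * k**(-1/3).
h = optimal_cell_width(1.0 / (4.0 * math.sqrt(math.pi)), 1000)
```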
The paper also discusses the bias term \( r(h) \) and provides conditions under which \( r(h) \) is of order \( h^2 \) or \( h^3 \). Examples are given to illustrate the behavior of the histogram as an estimator, showing that while it can be effective, it may not always achieve the optimal rate of convergence.
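One standard route to the \( h^2 \) order, sketched here under the smoothness assumptions above rather than as the paper's exact argument, expands \( f \) inside each cell. For \( x \) in the cell \( [jh, (j+1)h) \),
\[
\mathbb{E}\,H(x) - f(x) = \frac{1}{h}\int_{jh}^{(j+1)h} f(t)\,dt - f(x)
= f'(x)\left(\left(j+\tfrac{1}{2}\right)h - x\right) + O(h^2),
\]
and squaring and integrating contributes roughly \( f'(x)^2 h^3 / 12 \) per cell, so the integrated squared bias is approximately \( \frac{h^2}{12}\int f'(x)^2 \, dx \).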
Finally, the authors provide a detailed proof of the main theorems, using techniques such as Taylor's theorem and properties of uniform integrability. The results highlight the importance of choosing the cell width carefully to minimize the mean squared difference, and they offer a theoretical foundation for the practical use of histograms in density estimation.