5 Apr 2018 | BY SUSAN ATHEY, JULIE TIBSHIRANI, AND STEFAN WAGER
The paper introduces generalized random forests, a method for non-parametric statistical estimation based on random forests. This method can be used to estimate any quantity of interest identified by a set of local moment equations. Unlike classical kernel weighting functions, which are prone to the curse of dimensionality, generalized random forests use an adaptive weighting function derived from a forest to express heterogeneity in the quantity of interest. The authors propose a flexible and computationally efficient algorithm for growing generalized random forests, develop a large sample theory showing consistency and asymptotic Gaussianity of the estimates, and provide an estimator for their asymptotic variance to enable valid confidence intervals.
The method is applied to three statistical tasks: non-parametric quantile regression, conditional average partial effect estimation, and heterogeneous treatment effect estimation via instrumental variables. A software implementation, grf, is available for R and C++. The paper also discusses related work and provides theoretical guarantees for the method's performance.
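To make the idea of forest-derived weights concrete, the following is a minimal Python sketch of the quantile regression case. It is not the grf implementation: it assumes the trees have already been grown and that each tree's leaf assignment is available as a plain list of leaf ids. The function names `forest_weights` and `weighted_quantile` and the toy data are hypothetical and chosen only for illustration. A training point receives weight in proportion to how often it shares a leaf with the query point, and the estimate is the value that solves the weighted quantile condition.

```python
from typing import List, Sequence


def forest_weights(train_leaves: Sequence[Sequence[int]],
                   query_leaves: Sequence[int]) -> List[float]:
    """Forest-based weights for one query point (illustrative helper).

    train_leaves[b][i] is the leaf id of training point i in tree b.
    query_leaves[b] is the leaf id of the query point in tree b.
    Each tree spreads a total weight of 1 / n_trees evenly over the
    training points that share the query point's leaf.
    """
    n_trees = len(train_leaves)
    n_points = len(train_leaves[0])
    weights = [0.0] * n_points
    for tree_leaves, query_leaf in zip(train_leaves, query_leaves):
        members = [i for i, leaf in enumerate(tree_leaves) if leaf == query_leaf]
        if not members:
            continue  # no training point in this leaf: the tree contributes nothing
        for i in members:
            weights[i] += 1.0 / (len(members) * n_trees)
    return weights


def weighted_quantile(y: Sequence[float],
                      weights: Sequence[float],
                      q: float) -> float:
    """Smallest observed y whose cumulative weight share reaches q."""
    total = sum(weights)
    cumulative = 0.0
    pairs = sorted(zip(y, weights))
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q * total - 1e-12:  # small tolerance for rounding
            return value
    return pairs[-1][0]


# Toy data (assumed): five outcomes and two trees' leaf assignments.
y = [1.0, 2.0, 3.0, 4.0, 10.0]
train_leaves = [[0, 0, 1, 1, 1],
                [0, 0, 0, 1, 1]]
query_leaves = [1, 0]

alpha = forest_weights(train_leaves, query_leaves)
median_at_x = weighted_quantile(y, alpha, 0.5)
```

In this toy example the third training point shares a leaf with the query point in both trees, so it receives the largest weight. The same weights could in principle be reused for any quantile level, since the weighting step does not depend on the target quantile in this sketch.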