Understanding Recursive Partitioning for Heterogeneous Causal Effects

This paper proposes methods for estimating heterogeneity in causal effects in both experimental and observational studies, and for conducting hypothesis tests about differences in treatment effects across subsets of the population. The authors introduce a data-driven approach that partitions the data into subpopulations that differ in their treatment effects, and that allows valid confidence intervals for those effects to be constructed even when there are many covariates relative to the sample size. They propose an "honest" estimation approach, in which one sample is used to construct the partition and a separate sample is used to estimate the treatment effect for each subpopulation. The approach builds on regression tree methods, modified to optimize for goodness of fit in treatment effects and to account for honest estimation. The model selection criteria focus on improving the prediction of treatment effects conditional on covariates, while accounting for the change in the variance of treatment effect estimates within each subpopulation. Because the true causal effect is never observed for any individual unit, the authors propose ways to construct unbiased estimates of the mean squared error of the causal effect. In a simulation study, they show that honest estimation can substantially improve the coverage of confidence intervals, achieving nominal coverage rates without sacrificing the fit of the estimated treatment effects. The paper also discusses the theoretical and practical implications of the methods, including the trade-off in honest estimation between the smaller sample available for each task and the reduction in bias.
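The sample-splitting idea behind honest estimation can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a simulated randomized experiment with a single covariate, a tree with only one split, and a simplified splitting score (leaf size times squared estimated effect) that omits the variance adjustment described above. All function names, the candidate thresholds, and the simulated data are hypothetical and chosen only for illustration.

```python
import random
import statistics


def diff_in_means(rows):
    """Return (effect, standard error) for a leaf of (x, w, y) rows."""
    treated = [y for _, w, y in rows if w == 1]
    control = [y for _, w, y in rows if w == 0]
    effect = statistics.mean(treated) - statistics.mean(control)
    se = (statistics.variance(treated) / len(treated)
          + statistics.variance(control) / len(control)) ** 0.5
    return effect, se


def choose_split(train, candidates, min_arm=10):
    """Pick the threshold on x that maximizes a simplified score.

    Assumption: the score is sum over leaves of n_leaf * effect^2.
    The criterion summarized above also accounts for variance,
    which this sketch leaves out.
    """
    best_score, best_cut = float("-inf"), None
    for cut in candidates:
        leaves = [[r for r in train if r[0] <= cut],
                  [r for r in train if r[0] > cut]]
        # Skip splits that leave too few treated or control units in a leaf.
        if any(sum(1 for r in leaf if r[1] == arm) < min_arm
               for leaf in leaves for arm in (0, 1)):
            continue
        score = sum(len(leaf) * diff_in_means(leaf)[0] ** 2 for leaf in leaves)
        if score > best_score:
            best_score, best_cut = score, cut
    return best_cut


def honest_effects(data, candidates, seed=0):
    """Build the partition on one half, estimate leaf effects on the other."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    train, est = shuffled[:half], shuffled[half:]
    cut = choose_split(train, candidates)      # uses the training half only
    left = diff_in_means([r for r in est if r[0] <= cut])   # estimation half
    right = diff_in_means([r for r in est if r[0] > cut])   # estimation half
    return cut, left, right


def simulate(n, seed=1):
    """Hypothetical experiment: the true effect is 2 when x > 0.5, else 0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        w = rng.randint(0, 1)                  # random assignment, p = 0.5
        tau = 2.0 if x > 0.5 else 0.0
        rows.append((x, w, tau * w + rng.gauss(0.0, 1.0)))
    return rows


data = simulate(8000)
cut, (tau_left, se_left), (tau_right, se_right) = honest_effects(
    data, candidates=[i / 10 for i in range(1, 10)])
print(f"split at x <= {cut}")
for name, tau, se in (("left", tau_left, se_left), ("right", tau_right, se_right)):
    print(f"{name}: effect {tau:.2f}, "
          f"95% CI [{tau - 1.96 * se:.2f}, {tau + 1.96 * se:.2f}]")
```

Because the estimation half plays no role in choosing the split, the leaf estimates are ordinary differences in means on data that did not influence the partition, which is what makes the usual confidence intervals applicable. The cost is that each step sees only half of the observations.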

Recursive Partitioning for Heterogeneous Causal Effects

First Draft: October 2013, This Draft: December 2015 | Susan Athey, Guido W. Imbens