2010 February 10 | Brian K. Lee, Justin Lessler, and Elizabeth A. Stuart
This study evaluates the performance of various machine learning methods for estimating propensity scores, comparing them to logistic regression. The authors simulated data under seven scenarios with varying degrees of non-linearity and non-additivity in the relationship between covariates and exposure. Propensity score weights were estimated using logistic regression, classification and regression trees (CART), pruned CART, bagged CART, random forests, and boosted CART. Performance metrics included covariate balance, standard error, bias, and 95% confidence interval coverage.
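As a concrete illustration of the weighting step, here is a minimal sketch of fitting a propensity model and forming inverse-probability-of-treatment (ATE) weights. It uses scikit-learn's GradientBoostingClassifier as a stand-in for the boosted CART method compared in the paper; the data, hyperparameters, and clipping threshold are illustrative assumptions, not the authors' settings.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression

def ipt_weights(X, treatment, model):
    """Fit a propensity model and return inverse-probability-of-treatment weights."""
    model.fit(X, treatment)
    # Clip estimated propensities away from 0/1 to avoid extreme weights
    ps = np.clip(model.predict_proba(X)[:, 1], 1e-3, 1 - 1e-3)
    # ATE weights: 1/e(x) for treated units, 1/(1 - e(x)) for controls
    return np.where(treatment == 1, 1.0 / ps, 1.0 / (1.0 - ps))

# Toy data: four covariates, treatment depends on the first one
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
treatment = rng.binomial(1, 1.0 / (1.0 + np.exp(-X[:, 0])))

w_logit = ipt_weights(X, treatment, LogisticRegression(max_iter=1000))
w_boost = ipt_weights(X, treatment, GradientBoostingClassifier(n_estimators=200, max_depth=3))
```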
Logistic regression generally performed well under conditions of non-linearity or non-additivity alone, but showed poor performance when both were present. Ensemble methods, particularly boosted CART, provided better bias reduction and more consistent 95% CI coverage. These methods were more effective in handling non-linear and non-additive relationships between covariates and exposure.
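To make the non-linearity/non-additivity distinction concrete, the following sketch generates treatment assignment from a model containing a quadratic term and an interaction, in the spirit of the paper's simulation scenarios. The covariates and coefficients are illustrative assumptions, not the authors' actual design; the point is that a main-effects-only logistic regression is misspecified for this treatment model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
x1, x2, x3 = rng.normal(size=(3, n))

# True treatment model: linear terms plus non-linearity (x1**2)
# and non-additivity (x1 * x2)
logit_ps = 0.5 * x1 - 0.5 * x2 + 0.3 * x3 + 0.7 * x1**2 + 0.5 * x1 * x2
true_ps = 1.0 / (1.0 + np.exp(-logit_ps))
treatment = rng.binomial(1, true_ps)
```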
The study found that ensemble methods, especially boosted CART, outperformed logistic regression in terms of covariate balance and effect estimation. Boosted CART consistently provided excellent performance across different sample sizes and scenarios. The results suggest that machine learning methods, particularly boosted CART, can be useful for propensity score weighting in observational studies. The study also highlights the importance of checking balance in interactions when true outcome models include interaction terms. Overall, the findings support the use of machine learning techniques for propensity score estimation, offering advantages over traditional logistic regression methods.
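The balance-checking point lends itself to a short sketch: weighted absolute standardized mean differences for each covariate and, optionally, for pairwise interactions. The function names and the common 0.1 rule-of-thumb threshold mentioned in the comment are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def weighted_smd(x, treatment, w):
    """Weighted absolute standardized mean difference for one covariate."""
    t, c = treatment == 1, treatment == 0
    m1 = np.average(x[t], weights=w[t])
    m0 = np.average(x[c], weights=w[c])
    v1 = np.average((x[t] - m1) ** 2, weights=w[t])
    v0 = np.average((x[c] - m0) ** 2, weights=w[c])
    return abs(m1 - m0) / np.sqrt((v1 + v0) / 2)

def balance_table(X, treatment, w, check_interactions=True):
    """SMDs for each covariate and (optionally) each pairwise interaction.

    Values above roughly 0.1 are often read as meaningful imbalance.
    """
    cols = [X[:, j] for j in range(X.shape[1])]
    if check_interactions:
        cols += [X[:, j] * X[:, k]
                 for j in range(X.shape[1]) for k in range(j + 1, X.shape[1])]
    return np.array([weighted_smd(col, treatment, w) for col in cols])
```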