March 2002 | Marianne Bertrand, Esther Duflo, Sendhil Mullainathan
This paper examines the reliability of Difference-in-Differences (DD) estimates, which are commonly used to estimate causal effects by comparing how outcomes change before and after an intervention in affected and unaffected groups. The authors focus on the bias in standard errors caused by serial correlation, a common problem in DD studies. They find that conventional standard errors are severely understated: up to 45% of placebo laws appear to have significant effects at the 5% level. To address this, the authors propose three techniques: collapsing the data to ignore time-series variation, allowing for an arbitrary covariance structure between time periods, and using randomization inference testing methods. The third technique, based on randomization inference, performs well regardless of sample size.
The paper also reviews existing DD papers and how they handle serial correlation, finding that few address the problem explicitly. The authors conclude by discussing what their findings imply for the existing literature, and they suggest that researchers be cautious when interpreting DD estimates that do not properly account for serial correlation.
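The over-rejection problem and the first remedy (collapsing the data) can be illustrated with a small simulation. The sketch below is not the authors' code or data. It assumes a hypothetical balanced panel of 50 states and 20 years, outcomes with no true treatment effect, and AR(1) errors with an assumed autocorrelation of 0.8. Placebo laws are assigned to a random half of the states from a random start year. All function names and parameter values are illustrative.

```python
import numpy as np

def simulate_panel(rng, n_states=50, n_years=20, rho=0.8):
    """Outcomes with NO true treatment effect: state effect + AR(1) error."""
    shocks = rng.standard_normal((n_states, n_years))
    err = np.empty_like(shocks)
    err[:, 0] = shocks[:, 0] / np.sqrt(1.0 - rho**2)  # stationary start
    for t in range(1, n_years):
        err[:, t] = rho * err[:, t - 1] + shocks[:, t]
    state_effect = rng.standard_normal((n_states, 1))
    return state_effect + err

def within(x):
    """Two-way (state and year) demeaning for a balanced panel."""
    return (x - x.mean(axis=1, keepdims=True)
              - x.mean(axis=0, keepdims=True) + x.mean())

def naive_dd_t(y, treated, post):
    """t-statistic from fixed-effects OLS with conventional (iid) errors."""
    d_w = within(np.outer(treated, post).astype(float))
    y_w = within(y)
    beta = (d_w * y_w).sum() / (d_w**2).sum()
    resid = y_w - beta * d_w
    n_states, n_years = y.shape
    dof = n_states * n_years - n_states - n_years  # obs - fixed effects - slope
    se = np.sqrt((resid**2).sum() / dof / (d_w**2).sum())
    return beta / se

def collapsed_dd_t(y, treated, post):
    """Collapse each state to one pre/post change, then a two-sample t-test."""
    change = y[:, post].mean(axis=1) - y[:, ~post].mean(axis=1)
    a, b = change[treated], change[~treated]
    pooled_var = ((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)) \
        / (len(a) + len(b) - 2)
    se = np.sqrt(pooled_var * (1.0 / len(a) + 1.0 / len(b)))
    return (a.mean() - b.mean()) / se

def rejection_rates(n_sims=500, n_states=50, n_years=20, rho=0.8, seed=0):
    """Share of placebo laws found 'significant' at the 5% level."""
    rng = np.random.default_rng(seed)
    naive = collapsed = 0
    for _ in range(n_sims):
        y = simulate_panel(rng, n_states, n_years, rho)
        treated = np.zeros(n_states, dtype=bool)
        treated[rng.choice(n_states, n_states // 2, replace=False)] = True
        start = rng.integers(n_years // 4, 3 * n_years // 4)  # placebo law date
        post = np.arange(n_years) >= start
        naive += abs(naive_dd_t(y, treated, post)) > 1.96
        # 2.01 is roughly the 5% two-sided t critical value with 48 d.o.f.
        collapsed += abs(collapsed_dd_t(y, treated, post)) > 2.01
    return naive / n_sims, collapsed / n_sims

if __name__ == "__main__":
    naive_rate, collapsed_rate = rejection_rates()
    print(f"naive OLS rejection rate: {naive_rate:.2f}")
    print(f"collapsed rejection rate: {collapsed_rate:.2f}")
```

Under these assumptions, the naive rejection rate would be expected to land well above the nominal 5%, while the collapsed test should stay close to it. Exact figures depend on the seed, the assumed autocorrelation, and the panel dimensions. The collapse here relies on every treated state sharing one law date; staggered adoption would need a different aggregation.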