I Just Ran Two Million Regressions

MAY 1997 | XAVIER X. SALA-I-MARTIN
Xavier Sala-i-Martin discusses the results of running over two million regressions to identify variables significantly correlated with economic growth. Building on Barro's (1991) work, he finds that many variables are significantly correlated with growth in cross-sectional regressions. Empirical growth economists, however, face the problem of deciding which variables are truly related to growth, because growth theories do not specify which variables belong in the "true" regression.

Levine and Renelt (1992) used an extreme-bounds test to identify robust variables and found very few (or none) to be robust. Sala-i-Martin argues that this is because the test is too stringent: a variable fails as soon as its coefficient changes sign or becomes insignificant in a single regression, and this happens to many variables. He proposes a different approach that assigns a level of confidence to each variable based on the distribution of its estimated coefficients across regressions. He considers two cases: one in which the distribution of the coefficient estimates is assumed to be normal, and one in which it is not. In both cases he calculates the cumulative distribution function (CDF) of the coefficients, evaluated at zero, to assess their significance.

He finds that 22 out of 59 variables have a weighted CDF(0) greater than 0.95, indicating a strong correlation with growth. These variables include regional, political, economic, and openness indicators. Some variables, such as measures of government spending and financial sophistication, do not appear to be significantly related to growth. Sala-i-Martin concludes that a substantial number of variables are strongly related to growth, contrary to the pessimistic view that nothing is robust. The results are detailed in his 1996 paper.
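The contrast between the two robustness criteria can be made concrete with a small sketch. It is an illustration under stated assumptions, not the paper's own code: the extreme bounds are taken as the lowest and highest values of the coefficient estimate minus or plus two standard errors, CDF(0) is taken to mean the larger share of the coefficient distribution lying on one side of zero, and the weights are simply an input, since the summary does not say how they are constructed. The function names and the sample numbers are hypothetical.

```python
from statistics import NormalDist


def extreme_bounds(estimates, k=2.0):
    """Lowest (beta - k*se) and highest (beta + k*se) over all regressions."""
    lower = min(b - k * se for b, se in estimates)
    upper = max(b + k * se for b, se in estimates)
    return lower, upper


def passes_extreme_bounds(estimates, k=2.0):
    """Robust only if both bounds lie on the same side of zero."""
    lower, upper = extreme_bounds(estimates, k)
    return lower > 0 or upper < 0


def _normalize(weights, n):
    """Equal weights by default; otherwise rescale so they sum to one."""
    if weights is None:
        return [1.0 / n] * n
    total = sum(weights)
    return [w / total for w in weights]


def cdf0_normal(estimates, weights=None):
    """CDF(0) assuming the coefficient estimates are normally distributed.

    Uses the weighted mean of the estimates and the weighted mean of
    their variances as the parameters of a single normal distribution.
    """
    w = _normalize(weights, len(estimates))
    mean = sum(wi * b for wi, (b, _) in zip(w, estimates))
    var = sum(wi * se ** 2 for wi, (_, se) in zip(w, estimates))
    below = NormalDist(mean, var ** 0.5).cdf(0.0)
    return max(below, 1.0 - below)


def cdf0_non_normal(estimates, weights=None):
    """CDF(0) without assuming one common normal distribution.

    Each regression contributes its own mass below zero; the
    contributions are combined as a weighted average.
    """
    w = _normalize(weights, len(estimates))
    below = sum(
        wi * NormalDist(b, se).cdf(0.0) for wi, (b, se) in zip(w, estimates)
    )
    return max(below, 1.0 - below)


# Made-up (coefficient, standard error) pairs for one variable, one pair
# per regression. These are illustrative numbers, not results from the paper.
estimates = [(0.9, 0.3), (1.1, 0.4), (0.2, 0.25), (0.8, 0.3)]

if __name__ == "__main__":
    print("extreme bounds:", extreme_bounds(estimates))
    print("passes extreme-bounds test:", passes_extreme_bounds(estimates))
    print("CDF(0), normal case:", round(cdf0_normal(estimates), 4))
    print("CDF(0), non-normal case:", round(cdf0_non_normal(estimates), 4))
```

With these made-up numbers, a single imprecise regression pushes the lower extreme bound below zero, so the variable fails the extreme-bounds test even though most of the mass of its coefficient distribution lies above zero. That is the sense in which the test is described as too stringent.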