This paper examines the finite-sample behavior of two classes of estimators for log models: generalized linear models (GLM) and least squares estimators applied to the log of the dependent variable. The authors analyze how these estimators perform under data structures commonly encountered in health economics, including skewed distributions, heavy-tailed errors, and heteroscedasticity. They find that the choice of estimator can significantly affect empirical results, especially when the estimator does not match the data generating process.
The paper discusses the implications of using OLS on the log scale versus GLM estimators. While OLS on the log scale is the most commonly used method for analyzing such data, it can yield biased estimates of the mean on the original scale when the log-scale errors are heteroscedastic, unless the retransformation accounts for that heteroscedasticity. GLM estimators, on the other hand, can yield imprecise estimates if the error term is heavy-tailed on the log scale. The authors propose a method for selecting the most appropriate estimator based on tests that are relatively easy to implement.
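The retransformation problem described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' own experiment: the data generating process, the coefficient values, the sample size, and the helper names (`ols`, `glm_log_link`, `simulate`) are all assumptions made for illustration. It fits OLS on the log scale with a single homoscedastic smearing factor and a log-link GLM fitted by iteratively reweighted least squares, under log-scale error variance that rises with the covariate.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y = X @ beta + error."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def glm_log_link(X, y, n_iter=25):
    """Fit E[y|x] = exp(X @ beta) by IRLS with Poisson-type weights.

    Assumes y > 0 so that log-scale OLS can supply starting values.
    """
    beta = ols(X, np.log(y))
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu      # working response for the log link
        XtW = X.T * mu               # working weights equal mu
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta

def simulate(n=20000, seed=0):
    """Hypothetical DGP: log y = 1 + x + sigma(x) * eps, eps ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, n)
    X = np.column_stack([np.ones(n), x])
    sigma2 = 0.2 + 0.8 * x           # log-scale variance rises with x
    log_y = 1.0 + 1.0 * x + np.sqrt(sigma2) * rng.standard_normal(n)
    y = np.exp(log_y)
    # Under normal errors, log E[y|x] = 1 + x + sigma2 / 2 = 1.1 + 1.4 x.

    b_ols = ols(X, log_y)
    resid = log_y - X @ b_ols
    smear = np.mean(np.exp(resid))   # one smearing factor for all x
    pred_ols = np.exp(X @ b_ols) * smear

    b_glm = glm_log_link(X, y)
    pred_glm = np.exp(X @ b_glm)

    top = x > 0.75                   # region where the variance is largest
    return {
        "ols_slope": b_ols[1],
        "glm_slope": b_glm[1],
        "ols_ratio_top": pred_ols[top].mean() / y[top].mean(),
        "glm_ratio_top": pred_glm[top].mean() / y[top].mean(),
    }

if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this setup the log-scale OLS slope targets the effect of x on E[log y | x], while the GLM slope targets the effect on log E[y | x]; the two differ because the variance depends on x. The two ratio entries compare mean predictions with the observed mean in the top quartile of x, where a single smearing factor is least appropriate.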
The paper also presents an empirical example using data from the National Health Interview Survey to illustrate the importance of choosing the correct estimator. The results show that the precision of the estimates can vary significantly depending on the estimator used. The authors conclude that the choice of estimator can have major implications for empirical results, and that the performance of different estimators depends on the specific data generating mechanism. They recommend using a combination of tests and diagnostics to select the most appropriate estimator for a given data set.
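As an illustration of what such diagnostics might look like, the sketch below computes two quantities that are commonly used for this purpose: the kurtosis of the log-scale OLS residuals, as a check for heavy tails, and a Park-type regression of the log squared raw-scale residuals on the log of the predicted mean, as a check on how the variance scales with the mean. This summary does not spell out the authors' exact procedure, so the choice of these two diagnostics, the function names, and the interpretation thresholds in the comments are assumptions rather than a restatement of the paper.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y = X @ beta + error."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def glm_log_link(X, y, n_iter=25):
    """Fit E[y|x] = exp(X @ beta) by IRLS with Poisson-type weights (y > 0)."""
    beta = ols(X, np.log(y))
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu      # working response for the log link
        XtW = X.T * mu               # working weights equal mu
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta

def log_residual_kurtosis(X, y):
    """Kurtosis of log-scale OLS residuals; about 3 for normal errors.

    Values well above 3 suggest heavy tails on the log scale.
    """
    log_y = np.log(y)
    resid = log_y - X @ ols(X, log_y)
    centered = resid - resid.mean()
    return np.mean(centered ** 4) / np.mean(centered ** 2) ** 2

def park_type_slope(X, y):
    """Slope from regressing log((y - mu)^2) on log(mu).

    Roughly 0 suggests constant variance, 1 variance proportional to the
    mean, and 2 variance proportional to the squared mean (gamma-like).
    """
    mu = np.exp(X @ glm_log_link(X, y))
    Z = np.column_stack([np.ones(len(y)), np.log(mu)])
    return ols(Z, np.log((y - mu) ** 2))[1]

if __name__ == "__main__":
    # Hypothetical data: gamma-distributed outcome with a log-linear mean.
    rng = np.random.default_rng(1)
    n = 20000
    x = rng.uniform(0.0, 1.0, n)
    X = np.column_stack([np.ones(n), x])
    y = np.exp(0.5 + x) * rng.gamma(2.0, 0.5, n)
    print("log-scale residual kurtosis:", round(log_residual_kurtosis(X, y), 2))
    print("Park-type slope:", round(park_type_slope(X, y), 2))
```

The two numbers are meant to be read together: high kurtosis on the log scale points toward a least squares estimator on the log scale, while the Park-type slope indicates which variance function a GLM would need if a GLM is used.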