2008 | Achim Zeileis, Christian Kleiber, Simon Jackman
This paper introduces R functions for count data regression, focusing on the `hurdle()` and `zeroinfl()` functions from the `countreg` package. These functions extend the classical Poisson, geometric, and negative binomial regression models to handle over-dispersion and excess zeros, both of which are common in economic and social science data. The paper reviews the conceptual and computational features of these models and their implementation in R. It demonstrates how to fit, inspect, and test them using a microeconomic cross-section data set on the demand for medical care. The results show that hurdle and zero-inflated models fit these data better than the classical models, particularly in capturing over-dispersion and the number of zero counts. The paper also compares the fitted models in detail, highlighting similarities and differences in their mean functions and likelihoods.
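To make the distinction between the two model classes concrete, the following is a minimal sketch of their densities in generic notation. The symbols are assumptions chosen for illustration and are not taken from the summary above: $f_{\mathrm{count}}$ is a count density (for example Poisson or negative binomial), $f_{\mathrm{zero}}$ is the density of the hurdle component, and $\pi$ is the probability of the point mass at zero in the zero-inflated model. Dependence on regressors and parameters is suppressed.

```latex
% Hurdle model: zeros come only from the hurdle component;
% positive counts follow the count density truncated at zero.
\[
f_{\mathrm{hurdle}}(y) =
\begin{cases}
  f_{\mathrm{zero}}(0) & \text{if } y = 0, \\[4pt]
  \bigl(1 - f_{\mathrm{zero}}(0)\bigr)\,
  \dfrac{f_{\mathrm{count}}(y)}{1 - f_{\mathrm{count}}(0)} & \text{if } y > 0.
\end{cases}
\]

% Zero-inflated model: a mixture of a point mass at zero (weight \pi)
% and an untruncated count density (weight 1 - \pi).
\[
f_{\mathrm{zeroinfl}}(y) = \pi \, I_{\{0\}}(y) + (1 - \pi)\, f_{\mathrm{count}}(y)
\]
```

Under this sketch the zero-inflated model has two sources of zeros, since $P(Y = 0) = \pi + (1 - \pi)\, f_{\mathrm{count}}(0)$, whereas the hurdle model has a single source, $P(Y = 0) = f_{\mathrm{zero}}(0)$. This is one reason the two classes can fit the zero counts similarly well while differing in their mean functions.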