January 2002 | Jason W. Osborne and Elaine Waters
This article discusses four key assumptions of multiple regression that researchers should always test: normality of variables, linearity of relationships, reliability of measurement, and homoscedasticity. The authors emphasize the importance of checking these assumptions to ensure the validity of regression results and to avoid Type I and Type II errors.
First, variables should be normally distributed. Non-normal variables can distort relationships and significance tests. Researchers can assess normality through visual inspection of histograms, skew and kurtosis statistics, P-P plots, and Kolmogorov-Smirnov tests. Outliers can be identified through visual inspection or z-scores. Removing outliers can improve the accuracy of estimates; alternatively, data transformations can reduce the influence of outliers and bring a distribution closer to normality.
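The screening steps above can be sketched in a few lines. This is an illustrative example, not the authors' own procedure: the data are simulated, and the |z| > 3 cutoff is one common rule of thumb for flagging outliers.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(50, 10, size=500)       # hypothetical measured variable
x = np.append(x, [120.0])              # inject one extreme outlier

print("skew:", stats.skew(x))
print("excess kurtosis:", stats.kurtosis(x))  # 0 for a normal distribution

# Kolmogorov-Smirnov test against a normal with the sample's own parameters
stat, p = stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1)))
print("K-S statistic:", stat)

# Outlier screening via z-scores (|z| > 3 is a common rule of thumb)
z = (x - x.mean()) / x.std(ddof=1)
outliers = np.where(np.abs(z) > 3)[0]
print("outlier indices:", outliers)
```

The injected point stands out clearly on the z-score screen; in practice one would inspect flagged cases before deciding whether to remove or transform.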
Second, there should be a linear relationship between independent and dependent variables. Non-linear relationships can lead to under-estimation of the true relationship, increasing the risk of Type II errors. Residual plots and curvilinear components can help detect non-linearity.
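A small simulation makes the under-estimation concrete. The data below are hypothetical: the true relationship is quadratic, so a straight-line fit explains almost nothing, while adding a curvilinear (squared) component recovers the relationship.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, 200)
y = x**2 + rng.normal(0, 0.5, 200)   # truly curvilinear relationship

# A linear fit badly under-estimates the relationship
b1, b0 = np.polyfit(x, y, 1)
resid_lin = y - (b0 + b1 * x)

# Adding a quadratic component captures it
c2, c1, c0 = np.polyfit(x, y, 2)
resid_quad = y - (c0 + c1 * x + c2 * x**2)

def r_squared(resid):
    return 1 - resid.var() / y.var()

print("linear R^2:   ", r_squared(resid_lin))   # near zero
print("quadratic R^2:", r_squared(resid_quad))  # much higher
```

Plotting `resid_lin` against `x` would show the telltale U-shaped pattern that residual plots are meant to reveal.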
Third, variables should be measured without error. Unreliable measurement can lead to under-estimation of relationships, increasing the risk of Type II errors. Correction for low reliability can improve the accuracy of effect sizes.
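The classic correction for attenuation illustrates the point: dividing an observed correlation by the square root of the product of the two measures' reliabilities estimates the correlation between the underlying true scores. The numbers below are hypothetical.

```python
import math

def disattenuate(r_observed, rel_x, rel_y):
    """Correction for attenuation: estimate the true-score correlation
    from an observed correlation and the two measures' reliabilities."""
    return r_observed / math.sqrt(rel_x * rel_y)

# An observed r of .30 with reliabilities of .70 and .80
print(disattenuate(0.30, 0.70, 0.80))  # ~0.40
```

With even moderately unreliable measures, the observed effect of .30 understates a true-score correlation of about .40, which is the under-estimation the authors warn about.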
Fourth, homoscedasticity should be present, meaning the variance of errors should be the same across all levels of the independent variable. Heteroscedasticity can distort findings and increase the possibility of Type I errors. Visual inspection of residuals and formal tests can help detect heteroscedasticity.
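A simple numeric check in the spirit of the formal tests the authors mention: with simulated data whose error variance grows with the predictor, comparing residual variance at low versus high predictor values (a Goldfeld-Quandt-style comparison, chosen here for illustration) exposes the heteroscedasticity.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 10, 300))
# Error standard deviation grows with x: a heteroscedastic pattern
y = 2 + 0.5 * x + rng.normal(0, 0.2 + 0.3 * x)

b1, b0 = np.polyfit(x, y, 1)
resid = y - (b0 + b1 * x)

# Compare residual variance in the low-x and high-x thirds;
# a ratio far from 1 suggests heteroscedasticity
n = len(x)
var_low = resid[: n // 3].var(ddof=1)
var_high = resid[-(n // 3):].var(ddof=1)
print("variance ratio (high/low):", var_high / var_low)
```

A scatterplot of `resid` against `x` would show the same fan-shaped spread that visual inspection of residuals is meant to catch.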
The authors argue that checking these assumptions is crucial for accurate data analysis and that researchers should be familiar with non-parametric techniques when assumptions are not met.