January 19, 2024 | Lateef Babatunde Amusa, Twinomurinzi Hossana
This study compares various missing data treatments in Partial Least Squares Structural Equation Modeling (PLS-SEM), focusing on listwise deletion, mean imputation, regression imputation, and Expectation Maximization (EM) imputation. The research uses Monte Carlo simulations based on two social science models: the European Customer Satisfaction Index (ECSI) and the Unified Theory of Acceptance and Use of Technology (UTAUT). The simulations evaluate the performance of these methods under different missing data mechanisms (MCAR, MAR, MNAR) and missing proportions (20%, 30%, 40%, 50%) across sample sizes of 300 and 1000.
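The three missing data mechanisms can be made concrete with a small sketch. It is an illustration only: the summary does not say how the study generated missingness, so the rank-weighted deletion scheme, the function name `apply_missingness`, and its parameters are all assumptions. Under MCAR every row is equally likely to lose its value, under MAR the chance depends on another observed column, and under MNAR it depends on the value that goes missing.

```python
import numpy as np


def apply_missingness(data, target, driver, proportion, mechanism, rng):
    """Return a copy of `data` with NaNs placed in column `target`.

    Hypothetical helper: the rank-based weighting is an assumed scheme,
    not the procedure used in the study.
    """
    n = data.shape[0]
    n_missing = int(round(proportion * n))

    if mechanism == "MCAR":
        # Every row is equally likely to be deleted.
        weights = np.ones(n)
    elif mechanism == "MAR":
        # Deletion probability grows with an observed column (`driver`).
        weights = np.argsort(np.argsort(data[:, driver])) + 1.0
    elif mechanism == "MNAR":
        # Deletion probability grows with the value that is itself deleted.
        weights = np.argsort(np.argsort(data[:, target])) + 1.0
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")

    rows = rng.choice(n, size=n_missing, replace=False, p=weights / weights.sum())
    out = data.copy()
    out[rows, target] = np.nan
    return out


# Usage: 300 cases, 20% missing in column 0, for each mechanism.
rng = np.random.default_rng(42)
data = rng.normal(size=(300, 3))
for mechanism in ("MCAR", "MAR", "MNAR"):
    incomplete = apply_missingness(data, 0, 1, 0.2, mechanism, rng)
    print(mechanism, int(np.isnan(incomplete[:, 0]).sum()), "values removed")
```

Fixing the number of deleted rows, rather than drawing each deletion independently, keeps the missing proportion exact in every replication, which makes comparisons across proportions easier to read.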
The results show that regression imputation outperforms the other methods in recovering model parameters and estimating parameter precision. It consistently produces lower mean absolute error (MAE) values compared to listwise deletion and mean imputation, especially under MAR and MNAR mechanisms. EM imputation also performs well, with relatively stable MAE values across different missing proportions. However, listwise deletion and mean imputation are less effective, particularly under MNAR conditions, where they produce higher MAE values.
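To show how a mean absolute error comparison of this kind is computed, the following is a minimal Monte Carlo sketch. It is not the study's design. An ordinary least squares slope stands in for a PLS-SEM path coefficient, and the toy data model, the auxiliary variable `z`, the value `TRUE_SLOPE`, and all function names are assumptions made for illustration. EM imputation is left out to keep the sketch short.

```python
import numpy as np

TRUE_SLOPE = 0.5  # assumed population coefficient of the toy model


def simulate(n, rng):
    """Toy data: y depends on x and on an auxiliary variable z."""
    x = rng.normal(size=n)
    z = rng.normal(size=n)
    y = TRUE_SLOPE * x + 0.5 * z + rng.normal(scale=0.5, size=n)
    return x, z, y


def make_mar(y, z, proportion, rng):
    """Delete y values with probability increasing in the observed z (MAR)."""
    n = y.size
    ranks = np.argsort(np.argsort(z)) + 1.0
    rows = rng.choice(n, size=int(round(proportion * n)), replace=False,
                      p=ranks / ranks.sum())
    y_miss = y.copy()
    y_miss[rows] = np.nan
    return y_miss


def listwise(x, z, y):
    """Drop every case with a missing y."""
    keep = ~np.isnan(y)
    return x[keep], y[keep]


def mean_impute(x, z, y):
    """Replace missing y with the mean of the observed y."""
    return x, np.where(np.isnan(y), np.nanmean(y), y)


def regression_impute(x, z, y):
    """Replace missing y with predictions from a regression on x and z."""
    observed = ~np.isnan(y)
    design = np.column_stack([np.ones_like(x), x, z])
    coef, *_ = np.linalg.lstsq(design[observed], y[observed], rcond=None)
    return x, np.where(observed, y, design @ coef)


def mae_by_method(n=300, proportion=0.3, reps=200, seed=0):
    """Mean absolute error of the estimated slope for each treatment."""
    rng = np.random.default_rng(seed)
    methods = {"listwise": listwise, "mean": mean_impute,
               "regression": regression_impute}
    errors = {name: [] for name in methods}
    for _ in range(reps):
        x, z, y = simulate(n, rng)
        y_miss = make_mar(y, z, proportion, rng)
        for name, treat in methods.items():
            xs, ys = treat(x, z, y_miss)
            slope = np.polyfit(xs, ys, 1)[0]  # OLS slope of y on x
            errors[name].append(abs(slope - TRUE_SLOPE))
    return {name: float(np.mean(vals)) for name, vals in errors.items()}


results = mae_by_method()
print(results)
```

The numbers this prints describe only the toy model and should not be read as a reproduction of the reported results. The structure is the part that carries over: generate complete data, delete values under a chosen mechanism and proportion, apply each treatment, and average the absolute deviation from the known parameter over replications.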
The study highlights the importance of using appropriate missing data treatments in PLS-SEM to ensure accurate model parameter estimation and reliable results. Regression imputation is recommended as a more effective alternative to listwise deletion and mean imputation, especially in scenarios with non-random missing data. The findings suggest that researchers should consider the missing data mechanism when selecting an imputation method to avoid biased results. The study also notes the limitations of current imputation methods in PLS-SEM, such as the lack of software implementations for multiple imputation techniques. Overall, the research contributes to the understanding of missing data handling in PLS-SEM and provides guidance for practitioners and researchers in selecting appropriate methods for their analyses.