The prevention and handling of the missing data

2013 May 64(5): 402-406 | Hyun Kang
The article by Hyun Kang from the Department of Anesthesiology and Pain Medicine at Chung-Ang University College of Medicine in Seoul, Korea, discusses the prevention and handling of missing data in research. Missing data can reduce statistical power, introduce bias, and make the sample less representative, any of which can lead to invalid conclusions.
The paper reviews the types of missing data, namely Missing Completely at Random (MCAR), Missing At Random (MAR), and Missing Not At Random (MNAR), and their implications for study design and analysis. It then outlines the main techniques for handling missing data: listwise deletion, pairwise deletion, mean substitution, regression imputation, last observation carried forward (LOCF), maximum likelihood, expectation-maximization (EM), multiple imputation, and sensitivity analysis. Listwise deletion is the most common approach but can introduce bias if the data are not MCAR. Pairwise deletion preserves more information but can still produce biased estimates. Mean substitution and regression imputation are less suitable because they add no new information and can lead to underestimated errors. LOCF is simple but rests on the strong assumption that the outcome does not change after the last observed value, which can bias estimates. Multiple imputation is recommended because it incorporates the variability and uncertainty of the imputed values and so supports valid statistical inference. The article concludes with recommendations for handling missing data, emphasizing that data collection should be maximized at the study design stage and that sophisticated statistical techniques should be used only after maximal efforts to reduce missing data have been made.
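To make the contrast between the simpler techniques concrete, here is a minimal sketch in Python of listwise deletion, mean substitution, and LOCF applied to one variable. It is an illustration under stated assumptions, not code from the article: the `scores` values and the function names `listwise_delete`, `mean_substitute`, and `locf` are hypothetical, and the list is assumed to hold repeated measurements of a single subject in time order, with `None` marking a missing value.

```python
from statistics import mean, variance

# Hypothetical repeated measurements of one subject, in time order.
# None marks a missing value. These numbers are made up for illustration.
scores = [4.0, None, 6.0, 8.0, None, 10.0]


def listwise_delete(values):
    """Drop every missing observation (complete-case analysis)."""
    return [v for v in values if v is not None]


def mean_substitute(values):
    """Replace each missing value with the mean of the observed values."""
    observed_mean = mean(listwise_delete(values))
    return [observed_mean if v is None else v for v in values]


def locf(values):
    """Carry the last observed value forward; leading gaps stay missing."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


complete = listwise_delete(scores)
mean_filled = mean_substitute(scores)
locf_filled = locf(scores)

print(complete)     # [4.0, 6.0, 8.0, 10.0]
print(mean_filled)  # [4.0, 7.0, 6.0, 8.0, 7.0, 10.0]
print(locf_filled)  # [4.0, 4.0, 6.0, 8.0, 8.0, 10.0]

# Mean substitution leaves the mean unchanged but shrinks the sample
# variance, which is one way errors end up underestimated.
print(variance(complete) > variance(mean_filled))  # True
```

In this toy example the sample variance falls from about 6.67 on the complete cases to 4.0 after mean substitution, even though no new information was added. Multiple imputation is not shown here because it requires an imputation model and the pooling of results across several imputed data sets, which goes beyond a short sketch.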