GAIN: Missing Data Imputation using Generative Adversarial Nets

7 Jun 2018 | Jinsung Yoon, James Jordon, Mihaela van der Schaar
The paper introduces Generative Adversarial Imputation Nets (GAIN), a method for imputing missing data. GAIN adapts the Generative Adversarial Nets (GAN) framework: the generator ($G$) observes some components of a real data vector and imputes the missing components conditioned on what is observed, and the discriminator ($D$) takes the completed vector and tries to determine which components were actually observed and which were imputed. To ensure that $G$ learns to generate according to the true data distribution, GAIN supplies $D$ with additional information in the form of a hint vector, which reveals partial information about the missingness of the original sample. The paper provides theoretical results and compares GAIN against state-of-the-art imputation methods on various datasets. In these experiments GAIN outperforms methods such as MICE, MissForest, and matrix completion, both in imputation accuracy and in the accuracy of predictions made from the imputed data. The method is also robust to different missing rates, sample sizes, and feature dimensions.
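The data flow described above (a missingness mask, a hint derived from it, and a completed vector that keeps observed values and fills the rest from $G$) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `hint_rate` parameter, the choice to encode hidden hint entries as 0.5, and the column-mean stand-in for $G$ are all assumptions made for illustration. In GAIN itself, $G$ and $D$ are neural networks trained adversarially.

```python
import numpy as np


def make_mask(x):
    """Return 1.0 where a value is observed and 0.0 where it is missing (NaN)."""
    return (~np.isnan(x)).astype(float)


def make_hint(mask, hint_rate, rng):
    """Reveal each mask entry with probability hint_rate.

    Assumption: hidden entries are encoded as 0.5, i.e. "unknown" between
    0 (missing) and 1 (observed). Other encodings are possible.
    """
    b = (rng.uniform(size=mask.shape) < hint_rate).astype(float)
    return b * mask + 0.5 * (1.0 - b)


def placeholder_generator(x, mask):
    """Hypothetical stand-in for G: propose the observed column mean everywhere.

    A real generator would be a trained network fed the observed values,
    the mask, and random noise.
    """
    x_zero = np.nan_to_num(x, nan=0.0)
    counts = np.maximum(mask.sum(axis=0), 1.0)  # avoid division by zero
    means = (mask * x_zero).sum(axis=0) / counts
    return np.broadcast_to(means, x.shape)


def combine(x, mask, g_out):
    """Keep observed components; take missing components from the generator."""
    x_zero = np.nan_to_num(x, nan=0.0)
    return mask * x_zero + (1.0 - mask) * g_out


rng = np.random.default_rng(0)
x = np.array([[1.0, np.nan],
              [3.0, 4.0]])
mask = make_mask(x)
hint = make_hint(mask, hint_rate=0.9, rng=rng)   # what D would see alongside the data
imputed = combine(x, mask, placeholder_generator(x, mask))
```

In this sketch the discriminator's target would be `mask` itself: for each component it must guess whether the value in `imputed` was observed or filled in, using `hint` as side information.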