Training Unbiased Diffusion Models from Biased Dataset

2024 | Yeongmin Kim¹, Byeonghu Na¹, Minsang Park¹, JoonHo Jang¹, Dongjun Kim¹, Wanmo Kang¹, Il-Chul Moon¹,²
This paper proposes Time-dependent Importance reWeighting (TIW) for training unbiased diffusion models from biased datasets. Dataset bias propagates into a diffusion model's generated outputs; TIW mitigates this by reweighting samples with a time-dependent density ratio and using the same ratio to correct the score function, which reduces latent bias and improves sample quality. The time-dependent density ratio is estimated with a time-dependent discriminator, yielding more accurate estimates than previous time-independent approaches. The resulting objective is theoretically connected to traditional score matching and is shown to converge to the unbiased data distribution. Experiments on CIFAR-10, CIFAR-100, FFHQ, and CelebA show that TIW outperforms existing methods, including time-independent importance reweighting, across various bias settings, reducing bias in generated samples and improving the fairness of the model. The code is available at the provided GitHub link.
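To make the reweighting idea concrete, below is a minimal PyTorch sketch, not the authors' implementation: it assumes a small unbiased reference set is available alongside the biased data and uses a simple VP-style perturbation schedule; the names TimeDiscriminator, perturb, discriminator_loss, and tiw_dsm_loss are illustrative placeholders. It covers only the loss reweighting; the full TIW method additionally uses the gradient of the log density ratio to correct the score.

```python
# Sketch of time-dependent importance reweighting (TIW) for denoising score matching.
# Assumptions (not from the paper's released code): a small unbiased reference set,
# a placeholder VP-style perturbation, and an epsilon-prediction score network.

import torch
import torch.nn as nn
import torch.nn.functional as F


class TimeDiscriminator(nn.Module):
    """Classifies perturbed samples as coming from the unbiased (1) or biased (0) set at noise level t."""

    def __init__(self, data_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(data_dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x_t: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # Condition on t by concatenating it to the (flattened) input; returns logits.
        return self.net(torch.cat([x_t, t[:, None]], dim=1)).squeeze(-1)


def perturb(x: torch.Tensor, t: torch.Tensor):
    """Placeholder VP-style forward perturbation x_t = alpha(t) x + sigma(t) eps."""
    alpha = torch.exp(-0.5 * t)[:, None]
    sigma = torch.sqrt(1.0 - alpha ** 2)
    noise = torch.randn_like(x)
    return alpha * x + sigma * noise, noise


def discriminator_loss(disc, x_unbiased, x_biased):
    """Binary cross-entropy on perturbed samples from both datasets at a shared t."""
    t = torch.rand(x_unbiased.shape[0], device=x_unbiased.device)
    xu_t, _ = perturb(x_unbiased, t)
    xb_t, _ = perturb(x_biased, t)
    logits_u = disc(xu_t, t)
    logits_b = disc(xb_t, t)
    return (F.binary_cross_entropy_with_logits(logits_u, torch.ones_like(logits_u))
            + F.binary_cross_entropy_with_logits(logits_b, torch.zeros_like(logits_b)))


def tiw_dsm_loss(score_model, disc, x_biased):
    """Denoising score matching on biased data, reweighted by the time-dependent
    density ratio r_t(x) = p_unbiased,t(x) / p_biased,t(x) ~= D / (1 - D)."""
    t = torch.rand(x_biased.shape[0], device=x_biased.device)
    x_t, noise = perturb(x_biased, t)
    with torch.no_grad():
        d = torch.sigmoid(disc(x_t, t))
        ratio = d / (1.0 - d + 1e-6)                      # estimated density ratio at time t
    pred_noise = score_model(x_t, t)                      # epsilon-prediction network (assumed)
    per_sample = ((pred_noise - noise) ** 2).mean(dim=1)  # plain per-sample DSM term
    return (ratio * per_sample).mean()                    # time-dependent reweighting
```

In a training loop, the discriminator and the score model would be updated alternately: `discriminator_loss` keeps the ratio estimate current at every noise level, and `tiw_dsm_loss` uses it to upweight samples that are underrepresented in the biased set.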