29 May 2024 | Rishubh Parihar*, Abhijnya Bhat*, Abhipsa Basu, Saswat Mallick, Jogendra Nath Kundu, R. Venkatesh Babu
This paper presents a method for debiasing diffusion models (DMs) without requiring additional reference data or model retraining. The key idea is to enforce generated images to follow a prescribed attribute distribution through a technique called Distribution Guidance. The method leverages the latent features of the denoising UNet, which contain rich demographic semantics, to guide the generation process.
An Attribute Distribution Predictor (ADP) maps these latent features to the distribution of attributes; it is trained using pseudo labels produced by existing attribute classifiers. The proposed method reduces bias across single and multiple attributes and outperforms the baselines for both unconditional and text-conditional diffusion models. Additionally, the method is used to train a fair attribute classifier by augmenting the training set with generated data. It is evaluated on face generation and on large text-to-image diffusion models such as Stable Diffusion. The results show that the proposed method achieves fair generation with high image quality and is particularly effective at mitigating biases in large text-to-image diffusion models. The method can also generate prescribed imbalanced distributions and balance minority classes in attribute classification, and it is efficient, requiring minimal training data and computational resources. The paper closes by discussing limitations and directions for future work.
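The mechanism described above can be sketched in code. The following is a minimal, illustrative PyTorch sketch, not the authors' implementation: `AttributeDistributionPredictor` is a hypothetical small MLP that maps pooled UNet features to a batch-level attribute distribution, and `distribution_guidance` nudges the current latents so that this predicted distribution moves toward a prescribed reference distribution. The feature extractor `feature_fn`, the loss choice (KL divergence), and the guidance scale are all assumptions made for the sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttributeDistributionPredictor(nn.Module):
    """Hypothetical ADP: a small MLP mapping pooled UNet features
    of a batch to a predicted attribute distribution (illustrative)."""

    def __init__(self, feat_dim: int, n_classes: int):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 128),
            nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, feat_dim) pooled denoiser features.
        # Per-sample class probabilities, averaged over the batch
        # to give the batch-level attribute distribution.
        probs = self.head(feats).softmax(dim=-1)
        return probs.mean(dim=0)


def distribution_guidance(x_t, feature_fn, adp, ref_dist, scale=1.0):
    """One guidance step: move the batch of latents x_t so the
    ADP-predicted attribute distribution approaches ref_dist.

    feature_fn stands in for the UNet feature extraction; in the
    real method the gradient would flow through the denoiser."""
    x = x_t.detach().requires_grad_(True)
    feats = feature_fn(x)
    pred_dist = adp(feats)
    # Divergence between predicted and prescribed distributions.
    loss = F.kl_div(pred_dist.log(), ref_dist, reduction="sum")
    grad = torch.autograd.grad(loss, x)[0]
    # Gradient step on the latents, applied at each denoising step.
    return x_t - scale * grad
```

In use, this step would be interleaved with the usual denoising updates of the sampler, with `ref_dist` set to, e.g., a uniform distribution over a sensitive attribute to obtain balanced generations.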