This paper introduces FactorVAE, a method for learning disentangled representations that outperforms β-VAE in terms of disentanglement while maintaining reconstruction quality. FactorVAE encourages the distribution of representations to be factorial, and hence independent across latent dimensions, by adding a penalty on the total correlation of the latent variables to the VAE objective. The penalty is estimated with a discriminator network, in the same spirit as divergence minimization in GANs. The method is evaluated on several datasets, including 2D and 3D shapes, 3D faces, and 3D chairs, and shows improved disentanglement over β-VAE. The paper also identifies weaknesses in the disentanglement metric proposed by Higgins et al. (2016) and introduces a new metric that is more robust and avoids the failure mode of the previous one.
Additionally, the paper compares FactorVAE with InfoWGAN-GP, a variant of InfoGAN, and finds that FactorVAE achieves better disentanglement scores. The results demonstrate that FactorVAE provides a better trade-off between disentanglement and reconstruction quality, making it a more effective method for learning disentangled representations.
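To make the discriminator-based penalty more concrete, the following is a minimal NumPy sketch of the two ingredients such an estimate typically needs. It is an illustration under stated assumptions, not the authors' implementation. The function names (`permute_dims`, `tc_estimate`, `discriminator_loss`) are chosen here for illustration, and the discriminator itself is not implemented: its logits are assumed to be supplied by some classifier trained to tell encoder samples from dimension-wise shuffled samples.

```python
import numpy as np


def permute_dims(z, rng):
    """Shuffle each latent dimension independently across the batch.

    Each column keeps its own marginal distribution, but dependence between
    columns is broken, so the rows approximate samples from the product of
    the marginals.
    """
    z = np.asarray(z)
    permuted = np.empty_like(z)
    for j in range(z.shape[1]):
        permuted[:, j] = z[rng.permutation(z.shape[0]), j]
    return permuted


def tc_estimate(logits_joint):
    """Density-ratio estimate of the total correlation.

    Assumes the discriminator outputs D(z) = sigmoid(logit), the probability
    that z is an unshuffled encoder sample. Then log(D / (1 - D)) equals the
    logit, and its batch mean approximates the KL divergence between the
    joint and the product of marginals.
    """
    return float(np.mean(logits_joint))


def discriminator_loss(logits_joint, logits_perm):
    """Binary cross-entropy: label 1 for encoder samples, 0 for shuffled ones."""
    def softplus(x):
        return np.logaddexp(0.0, x)  # numerically stable log(1 + exp(x))

    loss_joint = np.mean(softplus(-np.asarray(logits_joint)))
    loss_perm = np.mean(softplus(np.asarray(logits_perm)))
    return 0.5 * float(loss_joint + loss_perm)


# Hypothetical usage: two strongly correlated latent dimensions.
rng = np.random.default_rng(0)
z0 = rng.normal(size=1000)
z = np.stack([z0, z0 + 0.1 * rng.normal(size=1000)], axis=1)
z_perm = permute_dims(z, rng)
```

In a full training loop, the encoder would be penalized in proportion to `tc_estimate` on its own samples, while the discriminator would be updated with `discriminator_loss` on `z` versus `z_perm`. The weighting of the penalty and the discriminator architecture are left unspecified here.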