10 Jul 2018 | Diederik P. Kingma*, Prafulla Dhariwal*
Glow is a generative flow model built around an invertible 1 × 1 convolution. The paper presents it as an improvement over prior flow-based models such as NICE and RealNVP, both in log-likelihood and in the efficiency of synthesizing large images. Key contributions include:
1. **Invertible 1 × 1 Convolution**: A learned, invertible linear mixing of the channels that replaces the fixed permutation used in previous flow models; a permutation is simply a special case. Its log-determinant is cheap to compute, so the operation slots directly into likelihood-based training (a minimal sketch follows the list).
2. **Multi-Scale Architecture**: Glow composes many flow steps in the multi-scale arrangement of RealNVP, factoring out half of the variables at regular intervals so that most computation happens at coarser resolutions; this is what lets the model scale to high-quality, high-resolution synthesis.
3. **ActNorm**: A per-channel scale-and-bias layer with data-dependent initialization: the parameters are set so that activations have zero mean and unit variance per channel on an initial minibatch, then trained as ordinary parameters. This gives the stabilizing effect of batch normalization without its sensitivity to small minibatch sizes (sketched after the list).
4. **Affine Coupling Layers**: Following RealNVP, each coupling layer leaves one half of the channels unchanged and applies an elementwise scale and shift, computed from that half by a neural network, to the other half. The transform is trivially invertible and its Jacobian is triangular, so both training and synthesis remain efficient (sketched after the list).
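Below is a minimal NumPy sketch of the invertible 1 × 1 convolution, assuming a single image in height × width × channels layout; the function names are illustrative, not taken from the paper's code. As in the paper, the weight matrix is initialized to a random rotation, whose log-determinant is zero.

```python
import numpy as np

def invertible_1x1_conv(x, W):
    """Mix channels at every spatial position with a learned c x c matrix W;
    applied this way, the operation is exactly a 1 x 1 convolution."""
    h, w, _ = x.shape
    y = x @ W.T                                # W acts on the channel axis
    # Every one of the h*w positions contributes log|det W| to the log-det.
    logdet = h * w * np.linalg.slogdet(W)[1]
    return y, logdet

def invertible_1x1_conv_inverse(y, W):
    """Exact inverse: apply W^{-1} at every spatial position."""
    return y @ np.linalg.inv(W).T

rng = np.random.default_rng(0)
c = 4
W = np.linalg.qr(rng.normal(size=(c, c)))[0]   # random orthogonal init, log|det| = 0
x = rng.normal(size=(8, 8, c))
y, logdet = invertible_1x1_conv(x, W)
assert np.allclose(invertible_1x1_conv_inverse(y, W), x)
```

Computing det W directly costs O(c³); the paper also describes an LU-decomposed parameterization of W that reduces the log-determinant to a sum over diagonal entries.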
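Next, a sketch of actnorm under the same assumptions (NumPy, illustrative names): the first batch the layer sees fixes the initialization so that post-actnorm activations have zero mean and unit variance per channel; afterwards, scale and bias would be trained as ordinary parameters.

```python
import numpy as np

class ActNorm:
    """Per-channel scale and bias with data-dependent initialization."""

    def __init__(self):
        self.initialized = False

    def forward(self, x):
        # x: (batch, height, width, channels)
        if not self.initialized:
            mean = x.mean(axis=(0, 1, 2))
            std = x.std(axis=(0, 1, 2)) + 1e-6
            self.scale = 1.0 / std             # makes channels unit-variance
            self.bias = -mean / std            # makes channels zero-mean
            self.initialized = True
        y = x * self.scale + self.bias
        # Each spatial position contributes sum(log|scale|) to the log-det.
        h, w = x.shape[1], x.shape[2]
        logdet = h * w * np.sum(np.log(np.abs(self.scale)))
        return y, logdet

    def inverse(self, y):
        return (y - self.bias) / self.scale

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(16, 8, 8, 4))
layer = ActNorm()
y, logdet = layer.forward(x)
assert np.allclose(layer.inverse(y), x)
```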
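Finally, a sketch of the affine coupling layer. Here `toy_nn` is a hypothetical stand-in for the small CNN Glow uses to produce the scale and shift; any function of the unchanged half works, because it is only evaluated, never inverted. The exp parameterization of the scale follows the paper's description.

```python
import numpy as np

def affine_coupling_forward(x, nn):
    """Split channels in half; the first half passes through unchanged and
    parameterizes an elementwise affine transform of the second half."""
    c = x.shape[-1] // 2
    x1, x2 = x[..., :c], x[..., c:]
    log_s, t = nn(x1)                  # computed from the unchanged half
    y2 = x2 * np.exp(log_s) + t
    logdet = np.sum(log_s)             # Jacobian is triangular
    return np.concatenate([x1, y2], axis=-1), logdet

def affine_coupling_inverse(y, nn):
    c = y.shape[-1] // 2
    y1, y2 = y[..., :c], y[..., c:]
    log_s, t = nn(y1)                  # y1 == x1, so this recomputes exactly
    x2 = (y2 - t) * np.exp(-log_s)
    return np.concatenate([y1, x2], axis=-1)

def toy_nn(h):
    # Hypothetical stand-in for Glow's CNN producing (log_s, t).
    return np.tanh(h), 0.5 * h

rng = np.random.default_rng(1)
x = rng.normal(size=(8, 8, 4))
y, logdet = affine_coupling_forward(x, toy_nn)
assert np.allclose(affine_coupling_inverse(y, toy_nn), x)
```

Because the first half passes through unchanged, the inverse can recompute the scale and shift exactly, which is what makes coupling layers cheap in both directions.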
The paper demonstrates significant improvements in log-likelihood on standard benchmarks and shows that Glow can efficiently synthesize realistic images at high resolutions. Qualitative experiments on the CelebA-HQ dataset validate the model's ability to produce high-quality samples and perform semantic attribute manipulation. The code for Glow is available on GitHub.