LOSSY IMAGE COMPRESSION WITH COMPRESSIVE AUTOENCODERS


1 Mar 2017 | Lucas Theis, Wenzhe Shi, Andrew Cunningham & Ferenc Huszár
This paper proposes a new approach to optimizing autoencoders for lossy image compression. The authors show that minimal changes to the loss function suffice to train deep autoencoders that perform competitively with JPEG 2000 and outperform recent approaches based on recurrent neural networks (RNNs). Thanks to a sub-pixel architecture, the network is computationally efficient and suitable for high-resolution images. This contrasts with previous work on autoencoders for compression, which relied on coarser approximations, used shallower architectures, or was restricted to small images.

The key challenge in lossy compression is the non-differentiability of the compression loss, in particular the quantization step. The authors handle this with a simple but effective approximation, and they deal with the non-differentiable cost of coding the quantized coefficients by estimating the entropy rate, which is needed to train for efficient compression.

A compressive autoencoder (CAE) is defined by three components: an encoder, a decoder, and a probabilistic model. The encoder maps an input image to a compressed representation, the decoder reconstructs the image from that representation, and the probabilistic model assigns bit lengths to representations according to how frequently they occur. The objective function trades off the number of bits used against the distortion introduced by compression (a minimal sketch of such an objective is given below).

To cope with the non-differentiability of rounding, the derivative of the rounding function is replaced with the derivative of a smooth approximation during backpropagation. Stochastic rounding and additive noise are discussed as alternative differentiable surrogates. These techniques make it possible to train the autoencoder end to end (illustrative sketches follow below).

The authors also introduce a mechanism for variable bit rates: scale parameters adjust the coarseness of quantization, giving fine-grained control over the number of bits used by a single trained network.

In experiments, the approach is compared with JPEG, JPEG 2000, and RNN-based methods. It performs similarly to or better than JPEG 2000 in terms of perceptual quality and outperforms the other methods in terms of structural similarity, while being computationally efficient and, the authors argue, applicable to a wide range of media formats. They conclude that compressive autoencoders provide a flexible and efficient solution for lossy image compression, with the potential to adapt quickly to new tasks and environments, and they note that further research is needed on perceptually relevant metrics that are suitable for optimization.
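The rate-distortion trade-off described above can be summarized in a single loss. The following is a minimal NumPy sketch, assuming a mean-squared-error distortion and a generic bit-cost model; the names encoder, decoder, log2_q, and beta are illustrative placeholders, not the authors' code.

    import numpy as np

    def cae_loss(x, encoder, decoder, log2_q, beta):
        # Quantize the encoder output with simple rounding.
        z = np.round(encoder(x))
        # Rate term: estimated number of bits the probabilistic model assigns to z.
        rate = -np.sum(log2_q(z))
        # Distortion term: here, mean squared error between input and reconstruction.
        distortion = np.mean((x - decoder(z)) ** 2)
        # beta controls the trade-off between bit rate and distortion.
        return rate + beta * distortion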
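The rounding workarounds mentioned above can be pictured as follows. This is a NumPy illustration with placeholder names; the identity-gradient trick only takes effect inside an automatic-differentiation framework, where the same expression is usually written with a stop-gradient operation.

    import numpy as np

    def round_identity_grad(y):
        # Forward pass: exact rounding. In an autodiff framework, wrapping
        # (round(y) - y) in a stop-gradient lets gradients flow through y as if
        # rounding were the identity, i.e. the derivative of rounding is
        # replaced by the derivative of a smooth approximation.
        return y + (np.round(y) - y)

    def round_stochastic(y, rng):
        # Round up with probability equal to the fractional part, so the
        # expected value of the output equals y.
        frac = y - np.floor(y)
        return np.floor(y) + (rng.random(y.shape) < frac)

    def round_additive_noise(y, rng):
        # Replace rounding by additive uniform noise in [-0.5, 0.5) during training.
        return y + rng.uniform(-0.5, 0.5, size=y.shape)

For the stochastic variants, a generator such as rng = np.random.default_rng(0) can be passed in.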
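Finally, the variable-bit-rate mechanism based on scale parameters can be thought of as rescaling coefficients before rounding and undoing the scaling in the decoder: larger scales give coarser quantization bins and hence fewer bits. The sketch below uses illustrative names and assumes per-coefficient scales; it is not the authors' implementation.

    import numpy as np

    def quantize(z, scales):
        # Divide by the scales before rounding: larger scales -> coarser bins -> fewer bits.
        return np.round(z / scales)

    def dequantize(z_hat, scales):
        # The decoder multiplies the scales back in before reconstruction.
        return z_hat * scales

Under this reading, changing only the scales lets one trained autoencoder target several bit rates, matching the fine-grained rate control described in the summary.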