1 May 2018 | Johannes Ballé*, David Minnen*, Saurabh Singh*, Sung Jin Hwang*, Nick Johnston*
The paper introduces an end-to-end trainable model for image compression based on variational autoencoders (VAEs). The model incorporates a hyperprior to capture spatial dependencies in the latent representation, a novel approach in the context of image compression with artificial neural networks (ANNs). Unlike existing autoencoder-based compression methods, this model jointly trains a complex prior together with the underlying autoencoder. The hyperprior plays the role of side information, a concept common in modern image codecs but largely unexploited in ANN-based compression. Evaluated with both the MS-SSIM index and peak signal-to-noise ratio (PSNR), the model achieves state-of-the-art rate-distortion performance on both metrics.
The paper also provides a qualitative comparison of models trained with different distortion metrics, demonstrating the effectiveness of the hyperprior in improving compression performance. The experimental setup and results are detailed, including the network architecture, training process, and evaluation metrics. The authors discuss the implications of their work, compare it with existing methods, and highlight the advantages of their approach in terms of compression quality and efficiency.
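The training objective this setup implies, the bits needed to code the latents y under a Gaussian entropy model whose scales are predicted by the hyperprior, plus the bits for the side information z, plus a weighted distortion term, can be sketched in pure Python. This is a minimal illustrative sketch, not the authors' implementation: the toy values, the `lmbda` weight, and the simplified density-based bit estimate (which ignores quantization bin width) are all assumptions.

```python
import math

def gaussian_nll_bits(y, scale):
    """Approximate bits to code latent y under a zero-mean Gaussian
    of the given (hyperprior-predicted) scale, using the density at y."""
    p = math.exp(-0.5 * (y / scale) ** 2) / (scale * math.sqrt(2 * math.pi))
    return -math.log2(p)

def rd_loss(x, x_hat, y, scales, z_bits, lmbda=0.01):
    """Rate-distortion loss: R(y | hyperprior scales) + R(z) + lmbda * MSE.

    x, x_hat  -- original and reconstructed pixels (toy 1-D lists here)
    y, scales -- latents and the per-latent scales from the hyperprior
    z_bits    -- bits spent on the side information z (assumed given)
    """
    rate_y = sum(gaussian_nll_bits(yi, si) for yi, si in zip(y, scales))
    distortion = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    return rate_y + z_bits + lmbda * distortion

# Toy example: three latents, each with its own predicted scale.
loss = rd_loss(x=[0.5, -0.2, 0.1], x_hat=[0.4, -0.1, 0.2],
               y=[1.0, -0.5, 0.3], scales=[1.2, 0.8, 0.5],
               z_bits=2.0)
```

The key point the sketch illustrates is that the scales are not fixed hyperparameters: in the paper they are produced per-latent by the hyperprior network, so the entropy model adapts to the spatial structure of each image, and the whole objective is differentiable and trained end-to-end.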