Learned Image Compression with Discretized Gaussian Mixture Likelihoods and Attention Modules

30 Mar 2020 | Zhengxue Cheng, Heming Sun, Masaru Takeuchi, Jiro Katto
This paper proposes a learned image compression method that combines discretized Gaussian mixture likelihoods with attention modules to achieve state-of-the-art performance, addressing the performance gap between learned compression algorithms and existing compression standards, particularly in terms of PSNR.

The key contributions are twofold. First, the distributions of the latent codes are parameterized with discretized Gaussian mixture likelihoods, which yields a more accurate and flexible entropy model (a sketch of this computation is given below). Second, attention modules are incorporated into the network to improve performance by concentrating capacity on complex regions of the image; the architecture combines residual blocks with simplified attention modules to improve coding efficiency.

The models are trained with the Adam optimizer at a learning rate of 1e-4 and optimized for either MSE or MS-SSIM. Evaluated on the Kodak and CLIC datasets, covering both low- and high-resolution images, the method outperforms existing learned compression methods as well as traditional standards such as HEVC, JPEG2000, and JPEG, and achieves PSNR performance comparable to the latest standard, VVC. When optimized for MS-SSIM, it also produces visually pleasing reconstructions. Overall, the results demonstrate that the discretized Gaussian mixture likelihood is a flexible and accurate entropy model that, together with the attention modules, delivers significant improvements in rate-distortion performance and can be applied to a wide range of image compression tasks.
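To make the entropy model concrete, here is a minimal PyTorch-style sketch of how a discretized Gaussian mixture likelihood can be evaluated for quantized latents: each latent element is integrated over its quantization bin under a K-component Gaussian mixture whose weights, means, and scales would be predicted by the hyperprior and context model. The function name, tensor shapes, and the small epsilon clamp are illustrative assumptions, not the authors' implementation.

```python
import torch
from torch.distributions import Normal

def discretized_gmm_likelihood(y_hat, weights, means, scales, eps=1e-9):
    """Per-element likelihood of quantized latents under a K-component
    discretized Gaussian mixture.

    y_hat:   quantized latents, shape [B, C, H, W]
    weights: mixture weights (normalized over K), shape [B, K, C, H, W]
    means:   component means,  shape [B, K, C, H, W]
    scales:  component scales (positive), shape [B, K, C, H, W]
    """
    y = y_hat.unsqueeze(1)                    # broadcast over the K axis
    dist = Normal(means, scales)
    # Integrate each Gaussian over the quantization bin [y - 0.5, y + 0.5].
    upper = dist.cdf(y + 0.5)
    lower = dist.cdf(y - 0.5)
    per_component = upper - lower             # [B, K, C, H, W]
    likelihood = (weights * per_component).sum(dim=1)  # mix over K components
    return likelihood.clamp_min(eps)          # avoid log(0) in the rate term

# The rate term (in bits) would then be -torch.log2(likelihood).sum().
```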
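The simplified attention module can be pictured as a trunk branch of residual units rescaled by a sigmoid mask branch, with an identity shortcut so that complex regions receive more emphasis. The sketch below follows that arrangement; the bottleneck residual unit, the number of units, and the class names are assumptions for illustration rather than the exact layer configuration used in the paper.

```python
import torch
import torch.nn as nn

class ResidualUnit(nn.Module):
    """A small bottleneck residual unit (an assumed building block)."""
    def __init__(self, channels):
        super().__init__()
        mid = channels // 2
        self.body = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, kernel_size=1),
        )

    def forward(self, x):
        return x + self.body(x)

class SimplifiedAttention(nn.Module):
    """Trunk branch plus sigmoid mask branch: output = x + trunk(x) * mask(x)."""
    def __init__(self, channels, num_units=3):
        super().__init__()
        self.trunk = nn.Sequential(*[ResidualUnit(channels) for _ in range(num_units)])
        self.mask = nn.Sequential(
            *[ResidualUnit(channels) for _ in range(num_units)],
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # The mask rescales the trunk features so that capacity (and hence bits)
        # concentrates on complex regions; the identity path keeps training stable.
        return x + self.trunk(x) * self.mask(x)
```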
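Since the summary mentions training with Adam at a learning rate of 1e-4 and optimization for MSE or MS-SSIM, the snippet below shows a generic rate-distortion objective of the form L = λ·D + R that learned compression models of this kind are commonly trained on. The λ value and the helper name are placeholders, and this is an assumed formulation rather than the paper's exact loss.

```python
import torch

def rate_distortion_loss(x, x_hat, likelihoods, lam=0.01):
    """Generic R-D objective L = lam * D + R (lam value is a placeholder)."""
    num_pixels = x.size(0) * x.size(2) * x.size(3)
    rate_bpp = -torch.log2(likelihoods).sum() / num_pixels   # bits per pixel
    distortion = torch.mean((x - x_hat) ** 2)                # MSE; swap for 1 - MS-SSIM
    return lam * distortion + rate_bpp

# Training would pair this loss with Adam at the reported learning rate, e.g.:
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```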