3 Mar 2017 | Johannes Ballé*, Valero Laparra, Eero P. Simoncelli*
The paper presents an end-to-end optimized image compression method that combines a nonlinear analysis transformation, a uniform quantizer, and a nonlinear synthesis transformation. The transforms are constructed using convolutional linear filters and nonlinear activation functions, with a focus on implementing local gain control inspired by biological neurons. The model is jointly optimized for rate-distortion performance using a variant of stochastic gradient descent, introducing a continuous proxy for the discontinuous loss function from quantization. The optimized model outperforms standard JPEG and JPEG 2000 methods in both rate-distortion performance and visual quality, as measured by MS-SSIM. The method leverages a generalized divisive normalization (GDN) transform, which is highly efficient in Gaussianizing the local joint statistics of natural images. The optimization process involves a continuous relaxation of the quantization step, allowing for the use of stochastic gradient descent. The paper also discusses the relationship between the proposed method and variational autoencoders, highlighting differences in their objectives and applications. Experimental results demonstrate the superior performance of the proposed method across various test images and bit rates.
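The GDN nonlinearity mentioned above divides each response by a measure of the pooled activity of its neighbors, implementing local gain control. A minimal sketch of the per-location form of this operation (function and variable names are illustrative, not the paper's code; the paper's layers apply it across all spatial positions of a convolutional feature map):

```python
import numpy as np

def gdn(x, beta, gamma):
    """Generalized divisive normalization at one spatial location.

    x:     (channels,) vector of linear filter responses
    beta:  (channels,) nonnegative additive constants
    gamma: (channels, channels) nonnegative weights pooling squared
           responses across channels

    y_i = x_i / sqrt(beta_i + sum_j gamma_ij * x_j**2)
    """
    return x / np.sqrt(beta + gamma @ (x ** 2))

# Example: with beta = 1 and zero cross-channel weights, GDN is the identity.
x = np.array([1.0, -2.0, 3.0])
y = gdn(x, beta=np.ones(3), gamma=np.zeros((3, 3)))
```

Because the denominator grows with the pooled energy of neighboring responses, large responses are compressed more than small ones, which is what makes the transform effective at Gaussianizing local statistics.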
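The continuous relaxation of quantization can be illustrated concretely: rounding has zero gradient almost everywhere, so during training it is replaced by additive i.i.d. uniform noise, whose density matches a continuous interpolation of the quantized code's probability mass. A minimal sketch (names are illustrative):

```python
import numpy as np

def quantize(y):
    # Test time: true uniform scalar quantization (non-differentiable).
    return np.round(y)

def quantize_proxy(y, rng):
    # Train time: additive uniform noise on [-0.5, 0.5] as a
    # differentiable stand-in for rounding.
    return y + rng.uniform(-0.5, 0.5, size=y.shape)

rng = np.random.default_rng(0)
y = np.array([0.2, 1.7, -2.4])
print(quantize(y))             # -> [ 0.  2. -2.]
print(quantize_proxy(y, rng))  # noisy values, each within 0.5 of y
```

The proxy values stay within half a quantization bin of the true values, so the training objective tracks the actual rate-distortion loss while remaining amenable to stochastic gradient descent.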
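The joint rate-distortion objective has the standard Lagrangian form L = R + λD, where the rate is estimated from the learned entropy model of the (noisy) codes and the distortion is measured between the input and its reconstruction. A minimal sketch assuming MSE distortion and precomputed per-element log probabilities (the function name and arguments are hypothetical):

```python
import numpy as np

def rd_loss(x, x_hat, log2_probs, lam):
    """Rate-distortion Lagrangian L = R + lambda * D.

    x:          original image (array)
    x_hat:      reconstruction from the synthesis transform
    log2_probs: log2-probabilities of each code element under the
                learned entropy model
    lam:        trade-off weight between rate and distortion
    """
    rate = -np.sum(log2_probs)            # estimated code length in bits
    distortion = np.mean((x - x_hat) ** 2)  # MSE distortion
    return rate + lam * distortion
```

Sweeping λ traces out the rate-distortion curve: small λ favors low bit rates, large λ favors high-fidelity reconstructions, which is how the method is compared against JPEG and JPEG 2000 across bit rates.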