Lossy Image Compression with Foundation Diffusion Models

12 Apr 2024 | Lucas Relic, Roberto Azevedo, Markus Gross, Christopher Schroers
This paper introduces a lossy image compression method that leverages foundation latent diffusion models to produce highly realistic and detailed reconstructions at low bitrates. The approach formulates the removal of quantization error as a denoising task, using diffusion to recover information lost in the transmitted image latent. Because it performs only a subset of the denoising steps, the method requires significantly less computation than a full diffusion generative process. The proposed codec outperforms previous methods on quantitative realism metrics and is preferred by end users in a user study, even when other methods use twice the bitrate. Key contributions include a parameter estimation module that learns adaptive quantization parameters and the ideal number of denoising steps, enabling faithful and realistic reconstruction across a range of target bitrates with a single model. The method is evaluated on several datasets using objective metrics and a user study, demonstrating superior visual quality and perceptual preference.
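To make the idea of "quantization error as noise" concrete, here is a minimal toy sketch in Python with NumPy. It is not the authors' implementation: the paper learns the quantization parameters and the number of denoising steps with a parameter estimation module, whereas this sketch swaps in a simple hand-written heuristic. The function names (`quantize`, `make_alpha_bar`, `matching_timestep`), the linear beta schedule, and the rule of matching the noise-to-signal ratio are all assumptions chosen for illustration.

```python
import numpy as np


def quantize(latent, step):
    """Uniform scalar quantization: snap each value to the nearest multiple of `step`."""
    return np.round(latent / step) * step


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention schedule.

    Assumption: linear betas, a common DDPM default, not taken from the paper.
    """
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def matching_timestep(noise_var, alpha_bar):
    """Hypothetical heuristic (not the paper's learned estimator).

    Returns the first timestep whose noise-to-signal ratio
    (1 - alpha_bar) / alpha_bar reaches the measured error variance,
    assuming a variance-preserving forward process.
    """
    nsr = (1.0 - alpha_bar) / alpha_bar  # increases monotonically with the timestep
    idx = int(np.searchsorted(nsr, noise_var))
    return min(idx, len(alpha_bar) - 1)


# Toy stand-in for an image latent (random values, not a real encoder output).
rng = np.random.default_rng(0)
latent = rng.standard_normal((4, 32, 32))
alpha_bar = make_alpha_bar()

for step in (0.25, 0.5, 1.0):
    error_var = float(np.mean((quantize(latent, step) - latent) ** 2))
    t = matching_timestep(error_var, alpha_bar)
    print(f"step={step}: error variance={error_var:.4f}, start denoising at t={t} of {len(alpha_bar)}")
```

Under these assumptions, a coarser quantization step (lower bitrate) produces a larger error variance and therefore a later starting timestep, but the starting point stays well below the full schedule length. This mirrors the summary's point that only a subset of denoising steps is needed.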