Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder

15 Mar 2024 | Jinseok Kim, Tae-Kyun Kim
This paper proposes a method for arbitrary-scale image generation and upsampling that combines a pre-trained auto-encoder, a latent diffusion model, and an implicit neural decoder to produce images at arbitrary scales with high fidelity, diversity, and fast inference. The decoder consists of the symmetric, upsampling-free decoder of the pre-trained auto-encoder followed by a Local Implicit Image Function (LIIF) in series, so outputs can be rendered at any target resolution. The latent diffusion process is trained with denoising and alignment losses jointly: errors in the output images are backpropagated through the fixed decoder, which improves output quality.

On multiple public benchmarks for image super-resolution and novel image generation at arbitrary scales, the method outperforms existing methods in image quality, diversity, and scale consistency, and it is significantly better than the relevant prior art in inference speed and memory usage. While slower than GAN-based SR and regression models, it is fast compared with other diffusion-based super-resolution models, and it outperforms other generative models despite using less training data. By generating diverse outputs with high perceptual quality that remain consistent across scales, the method also addresses the ill-posed nature of the super-resolution problem.
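The joint training objective described above can be illustrated with a minimal sketch. Everything here is hypothetical and only mirrors the structure of the idea: a standard epsilon-prediction denoising loss plus an alignment loss computed by pushing the predicted clean latent through a frozen decoder. The shapes, the `decode` function, the noise-schedule coefficient `alpha`, and the weight `lam` are all illustrative stand-ins, not the paper's actual components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (hypothetical shapes): z0 is the clean latent, eps the true
# noise, eps_hat the denoiser's prediction.
z0 = rng.standard_normal((4, 8))
eps = rng.standard_normal((4, 8))
eps_hat = eps + 0.1 * rng.standard_normal((4, 8))

W_dec = rng.standard_normal((8, 16))  # frozen decoder weights (illustrative)

def decode(z):
    # Fixed decoder: latent -> image features. In the actual method this is
    # the pre-trained decoder whose parameters stay frozen during training.
    return np.tanh(z @ W_dec)

# Denoising loss: MSE between true and predicted noise.
loss_denoise = np.mean((eps_hat - eps) ** 2)

# Alignment loss: recover the clean latent implied by the predicted noise,
# decode it with the fixed decoder, and compare against the decoding of the
# true clean latent. In training, this is the path along which output-image
# errors are backpropagated through the frozen decoder.
alpha = 0.9  # hypothetical noise-schedule coefficient at this timestep
z_t = np.sqrt(alpha) * z0 + np.sqrt(1 - alpha) * eps
z0_hat = (z_t - np.sqrt(1 - alpha) * eps_hat) / np.sqrt(alpha)
loss_align = np.mean((decode(z0_hat) - decode(z0)) ** 2)

lam = 0.5  # hypothetical weighting between the two losses
loss = loss_denoise + lam * loss_align
```

The point of the alignment term is that the denoiser is supervised not only in latent space but also through the decoder's view of the latent, without updating the decoder itself.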
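A LIIF-style decoder is what makes arbitrary output scales possible: instead of emitting a fixed-size grid, it maps a continuous coordinate plus a local latent feature to a color, so any resolution can be sampled. The sketch below is a toy illustration of that querying pattern, not the paper's architecture; the feature map, the tiny MLP, and all weights are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical feature map from the upsampling-free decoder: H x W cells,
# C channels (real LIIF also uses feature unfolding and cell size; omitted).
H, W, C = 8, 8, 16
feat = rng.standard_normal((H, W, C))

# Toy ReLU MLP standing in for the implicit decoder (illustrative weights).
W1 = rng.standard_normal((C + 2, 32)) * 0.1
W2 = rng.standard_normal((32, 3)) * 0.1

def mlp(x):
    # (local feature, relative coordinate) -> RGB
    return np.maximum(x @ W1, 0.0) @ W2

def query_rgb(x, y):
    """Predict RGB at a continuous coordinate (x, y) in [0, 1]^2."""
    # Nearest latent cell, and the query's offset from that cell's center.
    i = min(int(y * H), H - 1)
    j = min(int(x * W), W - 1)
    cy, cx = (i + 0.5) / H, (j + 0.5) / W
    rel = np.array([x - cx, y - cy])
    return mlp(np.concatenate([feat[i, j], rel]))

# Because the decoder takes continuous coordinates, any output resolution
# can be rendered from the same latent feature map:
img_32 = np.array([[query_rgb((u + 0.5) / 32, (v + 0.5) / 32)
                    for u in range(32)] for v in range(32)])
img_100 = np.array([[query_rgb((u + 0.5) / 100, (v + 0.5) / 100)
                     for u in range(100)] for v in range(100)])
```

Both images are decoded from the same 8x8 latent grid; only the set of query coordinates changes, which is what "arbitrary scale" means here.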