18 Mar 2024 | Gaurav Parmar, Taesung Park, Srinivasa Narasimhan, Jun-Yan Zhu
This paper introduces a one-step image-to-image translation method applicable to both paired and unpaired settings. The key idea is to efficiently adapt a pre-trained text-conditional one-step diffusion model, such as SD-Turbo, to new domains and tasks via adversarial learning objectives. The authors propose a new generator architecture that leverages the SD-Turbo weights while preserving the input image structure. The method achieves visually appealing results comparable to existing conditional diffusion models while reducing inference to a single step, and it can be trained without image pairs. The authors demonstrate that their unpaired model, CycleGAN-Turbo, outperforms existing GAN-based and diffusion-based methods on various scene translation tasks, such as day-to-night conversion and adding or removing weather effects like fog, snow, and rain. They also extend the method to paired settings, where their model, pix2pix-Turbo, is on par with recent works like ControlNet for Sketch2Photo and Edge2Image, but with single-step inference. The work suggests that single-step diffusion models can serve as strong backbones for a range of GAN learning objectives. The code and models are available at https://github.com/GaParmar/img2img-turbo.
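To make the single-step setup concrete, the sketch below runs the off-the-shelf SD-Turbo image-to-image pipeline from Hugging Face diffusers for one effective denoising step. This is the pre-trained backbone the paper starts from, not the authors' adapted CycleGAN-Turbo or pix2pix-Turbo generator; the file names and prompt are illustrative placeholders.

```python
# Minimal sketch: one-step image-to-image with the vanilla SD-Turbo backbone
# via diffusers. This is the pre-trained model the paper adapts, NOT the
# paper's CycleGAN-Turbo/pix2pix-Turbo generator. File names and the prompt
# are placeholders.
import torch
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image

pipe = AutoPipelineForImage2Image.from_pretrained(
    "stabilityai/sd-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

init_image = load_image("day_scene.png").resize((512, 512))  # placeholder input

# diffusers runs int(num_inference_steps * strength) denoising steps, so
# 2 * 0.5 = 1 effective step. SD-Turbo is trained without classifier-free
# guidance, hence guidance_scale=0.0.
out = pipe(
    prompt="driving scene at night",  # placeholder target-domain prompt
    image=init_image,
    num_inference_steps=2,
    strength=0.5,
    guidance_scale=0.0,
).images[0]
out.save("night_scene.png")
```

In this vanilla pipeline the input image conditions generation only through the partial-noising step, which tends to lose scene structure; the paper's contribution is a generator architecture that reuses the SD-Turbo weights while preserving the input image structure, trained with adversarial objectives so it also works without paired data.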