5 Jan 2022 | Chenlin Meng, Yutong He, Yang Song, Jiaming Song, Jiajun Wu, Jun-Yan Zhu, Stefano Ermon
**SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations**
**Authors:** Chenlin Meng, Yutong He, Yang Song, Jiaming Song, Jiajun Wu, Jun-Yan Zhu, Stefano Ermon
**Institution:** Stanford University, Carnegie Mellon University
**Abstract:**
Guided image synthesis enables users to create and edit photo-realistic images with minimal effort. The challenge lies in balancing *faithfulness* to user inputs (e.g., hand-drawn colored strokes) and *realism* of the synthesized images. Existing GAN-based methods often require additional training data or loss functions for individual applications. To address these issues, SDEdit introduces a new image synthesis and editing method based on a diffusion model generative prior. Given an input image with user guidance in the form of manipulated RGB pixels, SDEdit first adds noise to the input and then denoises the resulting image through the SDE prior to increase its realism. SDEdit does not require task-specific training or inversions and can naturally achieve a balance between realism and faithfulness. According to a human perception study, SDEdit outperforms state-of-the-art GAN-based methods by up to 98.09% on realism and 91.72% on overall satisfaction scores on multiple tasks, including stroke-based image synthesis and editing as well as image compositing.
**Key Contributions:**
- **SDEdit:** A new image synthesis and editing framework based on stochastic differential equations (SDEs).
- **Balanced Realism and Faithfulness:** SDEdit naturally balances realism and faithfulness without task-specific training or inversions.
- **Performance:** In a human perception study, SDEdit outperforms state-of-the-art GAN-based methods by up to 98.09% on realism and 91.72% on overall satisfaction scores.
**Methods:**
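- **SDEdit Procedure:** Given an input image with user guidance, SDEdit perturbs the input with Gaussian noise up to an intermediate time \( t_0 \) and then simulates the reverse SDE from \( t_0 \) back to 0 to synthesize an image.
- **Realism-Faithfulness Trade-off:** The initialization time \( t_0 \) controls the balance: a larger \( t_0 \) adds more noise and yields more realistic but less faithful images, while a smaller \( t_0 \) keeps the output closer to the guide at the cost of realism.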
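
The two steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a variance-exploding perturbation \( x(t_0) = x^{(g)} + \sigma(t_0) z \), a hypothetical linear schedule `sigma(t)`, and a toy Gaussian "data distribution" whose score is known in closed form (`toy_score`), standing in for a trained score network. The names `sigma`, `toy_score`, and `sdedit` are made up for this example, and each array element is treated as an independent one-dimensional "pixel".

```python
import numpy as np

def sigma(t, sigma_max=10.0):
    """Assumed noise schedule with sigma(0) = 0 (illustrative choice only)."""
    return sigma_max * t

def toy_score(x, t, mu=0.0, data_std=1.0):
    """Exact score of the noised toy data N(mu, data_std^2 + sigma(t)^2).

    Stands in for a learned score model s_theta(x, t).
    """
    return -(x - mu) / (data_std**2 + sigma(t) ** 2)

def sdedit(guide, t0, score_fn=toy_score, n_steps=500, seed=0):
    """Perturb the guide to time t0, then integrate the reverse SDE back to 0."""
    rng = np.random.default_rng(seed)
    guide = np.asarray(guide, dtype=float)

    # Step 1: add Gaussian noise with standard deviation sigma(t0) to the guide.
    x = guide + sigma(t0) * rng.standard_normal(guide.shape)

    # Step 2: Euler-Maruyama discretization of the reverse VE-SDE.
    for n in range(n_steps, 0, -1):
        t = t0 * n / n_steps
        t_prev = t0 * (n - 1) / n_steps
        eps = np.sqrt(sigma(t) ** 2 - sigma(t_prev) ** 2)
        z = rng.standard_normal(guide.shape)
        x = x + eps**2 * score_fn(x, t) + eps * z
    return x

# A guide far from the toy data distribution (which is centered at 0).
guide = np.full(2000, 5.0)
faithful = sdedit(guide, t0=0.05)   # little noise: stays near the guide
realistic = sdedit(guide, t0=1.0)   # lots of noise: pulled toward the data
print("mean |output - guide|, t0=0.05:", np.mean(np.abs(faithful - guide)))
print("mean |output - guide|, t0=1.00:", np.mean(np.abs(realistic - guide)))
```

In this toy setting, a small `t0` leaves the samples close to the guide, while a large `t0` moves them toward the data distribution, which mirrors the trade-off described above. With a real diffusion model, `score_fn` would be the trained network and `guide` an image tensor.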
**Experiments:**
- **Stroke-Based Image Synthesis and Editing:** SDEdit outperforms baselines on stroke-based image synthesis and editing tasks.
- **Image Compositing:** SDEdit achieves better faithfulness and outperforms baselines by up to 83.73% on overall satisfaction scores.
**Conclusion:**
SDEdit is a novel guided image synthesis and editing method that leverages SDEs to achieve balanced realism and faithfulness. It does not require task-specific training or inversions, making it suitable for various editing tasks. SDEdit demonstrates superior performance in multiple experiments, highlighting its effectiveness in generating realistic and faithful images.