Diffusion Model-Based Image Editing: A Survey

16 Mar 2024 | Yi Huang*, Jiancheng Huang*, Yifan Liu*, Mingfu Yan*, Jiaxi Lv*, Jianzhuang Liu*, Senior Member, IEEE, Wei Xiong, He Zhang, Shifeng Chen, and Liangliang Cao, Senior Member, IEEE
This survey provides an extensive overview of diffusion model-based image editing, covering both theoretical and practical aspects. It examines the methodologies, input conditions, and the wide range of editing tasks that diffusion models support. The survey categorizes over 100 research papers into three primary classes based on learning strategy: training-based approaches, testing-time fine-tuning approaches, and training- and fine-tuning-free approaches. It also explores 10 distinct types of input conditions and 12 specific editing tasks, spanning semantic, stylistic, and structural editing. The paper highlights advances in image editing with diffusion models, such as text-conditioned editing, classifier-free guidance, and multimodal conditional methods. Additionally, it introduces EditEval, a benchmark for evaluating text-guided image editing algorithms, and proposes the LMM Score metric. The survey addresses current limitations and suggests future research directions, aiming to provide a comprehensive resource for the field.