NOISE MAP GUIDANCE: INVERSION WITH SPATIAL CONTEXT FOR REAL IMAGE EDITING

7 Feb 2024 | Hansam Cho, Jonghyun Lee, Seoung Bum Kim, Tae-Hyun Oh, Yonghyun Jeong
Noise Map Guidance (NMG) is an inversion method for real-image editing with text-guided diffusion models. Existing inversion methods often struggle to preserve spatial context and to produce high-quality edits, especially under classifier-free guidance (CFG). NMG addresses this by using noise maps, which capture the spatial context of the input image, and by conditioning the reverse process on both the noise maps and the text embeddings. This keeps the reconstruction path aligned with the inversion trajectory, preserving spatial context without sacrificing editing quality. NMG requires no per-timestep optimization, which makes it computationally cheaper than optimization-based inversion such as Null-text Inversion (NTI). Compared against baselines that include NTI and Negative-prompt Inversion (NPI), NMG performs better across local and global edits, non-rigid editing, and stylization. It is also robust to variants of the DDIM inversion framework, such as the one introduced by pix2pix-zero. These results are supported by quantitative evaluation with CLIPScore and TIFA and by a user study indicating that NMG's outputs align with human judgments of image quality. Overall, NMG offers a balance between speed and quality and works across a range of editing techniques.
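To make the alignment problem concrete, the sketch below runs DDIM inversion on a toy model, records the latent at every step, and measures how far a reverse pass drifts from that recorded trajectory at two CFG scales. It is not the authors' NMG implementation: `toy_eps` is a hypothetical stand-in for a trained noise predictor, the linear beta schedule and all function names are assumptions, and treating the recorded inversion latents as the "noise maps" is an assumption about the summary's terminology.

```python
import numpy as np


def make_alpha_bars(num_steps=50, beta_start=1e-4, beta_end=2e-2):
    """Cumulative signal levels for a linear beta schedule, with alpha_bar_0 = 1."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.concatenate([[1.0], np.cumprod(1.0 - betas)])


def toy_eps(x, cond):
    """Hypothetical stand-in for a trained noise predictor eps_theta(x_t, c)."""
    return 0.1 * x + cond


def cfg_eps(x, cond, scale):
    """Classifier-free guidance: eps_uncond + scale * (eps_cond - eps_uncond)."""
    eps_uncond = toy_eps(x, np.zeros_like(x))
    eps_cond = toy_eps(x, cond)
    return eps_uncond + scale * (eps_cond - eps_uncond)


def ddim_move(x, eps, a_from, a_to):
    """Deterministic DDIM update between two noise levels."""
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps


def invert(x0, cond, alpha_bars):
    """DDIM inversion at guidance scale 1; returns the latent at every step."""
    maps = [x0]
    for t in range(len(alpha_bars) - 1):
        eps = cfg_eps(maps[-1], cond, 1.0)
        maps.append(ddim_move(maps[-1], eps, alpha_bars[t], alpha_bars[t + 1]))
    return maps


def reconstruct(x_last, cond, alpha_bars, scale):
    """Reverse DDIM pass; index i of the result lines up with invert()'s output."""
    path = [x_last]
    for t in range(len(alpha_bars) - 1, 0, -1):
        eps = cfg_eps(path[-1], cond, scale)
        path.append(ddim_move(path[-1], eps, alpha_bars[t], alpha_bars[t - 1]))
    return path[::-1]


def max_drift(noise_maps, path):
    """Largest absolute gap between the inversion and reverse trajectories."""
    return max(float(np.max(np.abs(a - b))) for a, b in zip(noise_maps, path))


rng = np.random.default_rng(0)
x0 = rng.standard_normal(16)          # toy "image" latent
cond = np.full(16, 0.5)               # toy "text" condition
alpha_bars = make_alpha_bars()
noise_maps = invert(x0, cond, alpha_bars)

drift_scale_1 = max_drift(noise_maps, reconstruct(noise_maps[-1], cond, alpha_bars, 1.0))
drift_scale_7_5 = max_drift(noise_maps, reconstruct(noise_maps[-1], cond, alpha_bars, 7.5))
print("drift at CFG scale 1.0:", drift_scale_1)
print("drift at CFG scale 7.5:", drift_scale_7_5)
```

In this toy setting the reverse pass stays close to the recorded latents at scale 1 and moves away from them at the larger scale. That gap is the misalignment the summary describes, and the recorded latents are the kind of per-step spatial information a method could condition on to close it.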