Control Color: Multimodal Diffusion-based Interactive Image Colorization

16 Feb 2024 | Zhexin Liang, Zhaochen Li, Shangchen Zhou, Chongyi Li, Chen Change Loy
**Control Color (CtrlColor)** is a novel multi-modal colorization method that leverages the pre-trained Stable Diffusion (SD) model to achieve highly controllable and interactive image colorization. The method supports both unconditional colorization and conditional colorization guided by text prompts, strokes, or exemplar images, allowing flexible and precise local color manipulation.

Key contributions include:
1. **Multi-modal Colorization**: CtrlColor unifies various colorization tasks within a single framework, supporting prompt-based, stroke-based, and exemplar-based colorization.
2. **Color Overflow and Incorrect Color Handling**: The method addresses color overflow and incorrect coloring through self-attention guidance and a content-guided deformable autoencoder.
3. **Stroke-Based Colorization**: CtrlColor introduces a novel approach that lets users add strokes to control local colorization, enabling precise and flexible modifications.

**Methods**:
- **Latent Diffusion Model (LDM)**: CtrlColor uses the LDM to generate colorized images, leveraging the rich priors in the latent space.
- **ControlNet**: CtrlColor integrates ControlNet to control the generated content based on additional input conditions.
- **Content-guided Deformable Autoencoder**: This module handles large color overflow and incorrect coloring by aligning generated colors with input textures.
- **Streamlined Self-Attention Guidance**: This training-free guidance technique improves sample quality by blurring and re-predicting small color overflow areas.

**Experiments**:
- **Quantitative and Qualitative Comparisons**: CtrlColor outperforms state-of-the-art methods in terms of colorfulness, FID, and CLIP scores.
- **User Study**: CtrlColor receives high satisfaction ratings from users, indicating superior performance in perceptual realism, color richness, and aesthetic sense.
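The colorfulness score mentioned in the comparisons is not defined in this summary. A minimal sketch, assuming it refers to the widely used Hasler and Süsstrunk colorfulness measure, is given below. The function name `colorfulness` is illustrative and is not taken from the CtrlColor code.

```python
import numpy as np


def colorfulness(rgb: np.ndarray) -> float:
    """Hasler-Suesstrunk colorfulness of an H x W x 3 RGB image (values 0-255).

    Assumption: this is the metric meant by "colorfulness" in the summary.
    """
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    rg = r - g                  # red-green opponent channel
    yb = 0.5 * (r + g) - b      # yellow-blue opponent channel
    # Combine the spread and the mean offset of the two opponent channels.
    std_root = np.sqrt(rg.std() ** 2 + yb.std() ** 2)
    mean_root = np.sqrt(rg.mean() ** 2 + yb.mean() ** 2)
    return float(std_root + 0.3 * mean_root)


if __name__ == "__main__":
    gray = np.full((16, 16, 3), 128, dtype=np.uint8)   # no chroma at all
    red = np.zeros((16, 16, 3), dtype=np.uint8)
    red[..., 0] = 255                                  # saturated red
    print(colorfulness(gray), colorfulness(red))
```

A purely gray image scores zero under this measure, and the score grows as the opponent channels move away from neutral, which is why it is often reported alongside FID when comparing colorization methods.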
**Conclusion**: CtrlColor provides a versatile and effective solution for highly controllable image colorization, achieving state-of-the-art performance in color richness, stability, and visual quality.
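The summary does not say how user strokes are passed to the model. One common encoding in interactive colorization, shown here only as a hedged sketch and not as CtrlColor's actual implementation, paints the strokes onto a copy of the grayscale input and pairs that hint image with a binary mask marking where the user drew. The function `strokes_to_hint` and the `(row, col, radius, color)` stroke format are hypothetical.

```python
import numpy as np


def strokes_to_hint(gray: np.ndarray, strokes):
    """Build a hint image and a binary mask from circular color strokes.

    gray:    H x W uint8 grayscale image.
    strokes: iterable of (row, col, radius, (r, g, b)) tuples (hypothetical format).
    Returns (hint, mask): hint is H x W x 3 uint8, mask is H x W uint8 in {0, 1}.
    """
    h, w = gray.shape
    hint = np.repeat(gray[..., None], 3, axis=2)   # gray replicated to 3 channels
    mask = np.zeros((h, w), dtype=np.uint8)
    rows, cols = np.ogrid[:h, :w]
    for row, col, radius, color in strokes:
        # Pixels within `radius` of the stroke center form a filled disk.
        disk = (rows - row) ** 2 + (cols - col) ** 2 <= radius ** 2
        hint[disk] = color   # paint the user's color
        mask[disk] = 1       # mark the pixel as user-specified
    return hint, mask


if __name__ == "__main__":
    gray = np.full((8, 8), 128, dtype=np.uint8)
    hint, mask = strokes_to_hint(gray, [(4, 4, 1, (255, 0, 0))])
    print(mask.sum(), hint[4, 4], hint[0, 0])
```

In a setup like this, the hint image and mask would be supplied as additional conditions alongside the grayscale input, so the model can tell user-specified colors apart from regions it is free to colorize.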