BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion

11 Mar 2024 | Xuan Ju¹,², Xian Liu¹,², Xintao Wang¹*, Yuxuan Bian², Ying Shan¹, and Qiang Xu²*
BrushNet is a plug-and-play image inpainting model that introduces a dual-branch diffusion architecture to enhance the inpainting process. The model separates masked image features and noisy latent features into distinct branches, reducing the learning burden and enabling more precise integration of essential masked image information. BrushNet is designed to embed pixel-level masked image features into any pre-trained diffusion model, ensuring coherent and improved inpainting results. Additionally, BrushData and BrushBench are introduced to facilitate segmentation-based inpainting training and performance evaluation. Experimental results show that BrushNet outperforms existing models across seven key metrics, including image quality, mask region preservation, and textual coherence.

BrushNet addresses the limitations of previous inpainting methods by introducing an additional branch dedicated to masked image feature extraction, which allows for more effective and flexible inpainting. The model uses a VAE encoder to process masked images, enabling better feature extraction and alignment with the pre-trained UNet. It also employs a hierarchical approach that incorporates the additional branch's features into the full UNet layer by layer, allowing for dense per-pixel control. Furthermore, BrushNet removes text cross-attention from the additional branch so that only pure image information is considered there, enhancing the inpainting process.

BrushNet is evaluated on two benchmark datasets: EditBench for random brush masks and BrushBench for segmentation-based masks. The results demonstrate that BrushNet achieves state-of-the-art performance across various inpainting tasks, including random masks, inside-inpainting masks, and outside-inpainting masks. The model's flexible control ability allows for adjustments to the preservation scale of the unmasked region, enabling precise and customizable inpainting.
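The hierarchical, layer-by-layer feature injection described above can be sketched in a few lines. The snippet below is an illustrative numpy toy, not the paper's implementation: it assumes the additional branch's features enter the frozen UNet through zero-initialized 1x1 projections (a common design for such auxiliary branches), so that at the start of training the pre-trained UNet's behavior is unchanged.

```python
import numpy as np

def zero_conv(x, weight):
    # 1x1 "zero convolution" modeled as a per-channel scale; weights start
    # at 0 so the extra branch contributes nothing before training begins.
    return x * weight[:, None, None]

def fuse_branch_into_unet(unet_feats, branch_feats, zero_weights):
    """Add branch features into the frozen UNet hierarchically, layer by layer.

    unet_feats / branch_feats: lists of (C, H, W) feature maps, one per layer.
    zero_weights: per-layer (C,) weights of the zero-initialized projections.
    """
    return [u + zero_conv(b, w)
            for u, b, w in zip(unet_feats, branch_feats, zero_weights)]

# Toy check: with zero-initialized weights, the UNet features pass through
# unchanged, so the pre-trained model's output is preserved at init.
unet_feats = [np.ones((4, 8, 8)), np.ones((8, 4, 4))]
branch_feats = [np.random.randn(4, 8, 8), np.random.randn(8, 4, 4)]
zero_weights = [np.zeros(4), np.zeros(8)]
fused = fuse_branch_into_unet(unet_feats, branch_feats, zero_weights)
print(all(np.allclose(f, u) for f, u in zip(fused, unet_feats)))  # True
```

In an actual diffusion model the per-channel scale would be a trainable 1x1 convolution and the feature maps would come from the UNet's down/mid/up blocks; the zero initialization is what makes the branch safe to bolt onto any pre-trained backbone.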
The model's dual-branch design and flexible control features make it suitable for seamless integration with various pre-trained diffusion models, offering a plug-and-play solution for image inpainting.
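The adjustable preservation of the unmasked region can be illustrated with a simple mask-weighted blend. This is a hedged sketch, not BrushNet's exact operation: the `scale` knob and the function name are assumptions made for illustration, standing in for how strongly the original pixels outside the mask are kept versus the generated ones.

```python
import numpy as np

def blend_with_preservation(original, generated, mask, scale=1.0):
    """Blend generated content into the masked region.

    mask: 1 inside the region to inpaint, 0 in the region to preserve.
    scale in [0, 1] controls how strictly the unmasked region is preserved:
    1.0 copies the original outside the mask; lower values let more of the
    generated image through (an illustrative knob, not the paper's API).
    """
    keep = (1.0 - mask) * scale          # weight given to the original image
    return keep * original + (1.0 - keep) * generated

original = np.full((2, 2), 10.0)
generated = np.full((2, 2), 0.0)
mask = np.array([[1.0, 0.0], [0.0, 1.0]])   # inpaint the diagonal

out = blend_with_preservation(original, generated, mask, scale=1.0)
print(out.tolist())  # [[0.0, 10.0], [10.0, 0.0]]
```

In practice such blending is often applied with a blurred mask so the transition between preserved and generated regions stays smooth rather than producing a hard seam.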