LightIt: Illumination Modeling and Control for Diffusion Models


25 Mar 2024 | Peter Kocsis¹*, Julien Philip², Kalyan Sunkavalli², Matthias Nießner¹, Yannick Hold-Geoffroy²
¹Technical University of Munich, ²Adobe Research
**Abstract:** LightIt is a method for explicit illumination control in image generation. Recent generative models lack lighting control, which is crucial for artistic choices such as setting the overall mood or achieving a cinematic appearance. LightIt addresses this by conditioning the generation process on shading and normal maps, modeling lighting as single-bounce shading that includes cast shadows. It first trains a shading estimation module and uses it to generate a dataset of paired real-world images and shading maps. A control network is then trained with the estimated shading and normals as input. LightIt demonstrates high-quality image generation with lighting control across a variety of scenes. The generated dataset is additionally used to train an identity-preserving relighting model conditioned on an input image and a target shading. This is the first method to enable controllable, consistent lighting in image generation, and it performs on par with specialized state-of-the-art relighting methods.

**Introduction:** Generative imaging has evolved significantly, with diffusion models achieving outstanding performance when trained on large-scale real-image datasets. However, these models lack explicit lighting control, which leads to inconsistent lighting in generated images. LightIt proposes a single-view shading estimation method to generate a paired image-shading dataset. Given a single input image, the model predicts a 3D density field, traces rays through this field toward the light to obtain cast shadows, and predicts single-bounce shading maps (a minimal sketch of this computation appears after this summary). This makes it possible to generate shading maps for arbitrary lighting directions from a single image. The resulting dataset enables lighting control in image generation, with normals as additional conditioning to guide geometry.

**Method:** LightIt adds lighting control to a diffusion-based model. It first develops the shading estimation method and uses it to build a dataset of paired real images and shading maps. This dataset is then used to train a control module for the diffusion model that consumes shading and normal maps (see the conditioning sketch at the end of this summary). The method also includes a relighting module conditioned on an input image and a target shading, exploiting the strong natural-image prior of Stable Diffusion.

**Experiments:**
- **Image Synthesis:** LightIt produces consistent and coherent lighting across a wide range of text prompts, as confirmed by user studies and quantitative evaluations.
- **Relighting:** The method generalizes better to real-world samples than methods trained on synthetic data, as demonstrated in user studies and quantitative evaluations.

**Ablations:**
- **Lighting Representation:** Direct shading that includes cast shadows is essential for generating realistic images.
- **Normal Conditioning:** Conditioning on normals improves geometric consistency, particularly in shadow regions where the shading map alone carries little geometric information.

**Conclusion:** LightIt provides explicit control over illumination in image generation, achieving high-quality results while respecting the user-defined lighting. This approach improves the editability of diffusion-based generative imaging.
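The single-bounce shading described in the Introduction reduces to a Lambertian term max(0, n · l) gated by a visibility term obtained by marching a ray from each surface point toward the light through the predicted density field; where accumulated density is high, the point is in cast shadow. Below is a minimal NumPy sketch of that computation, not the paper's implementation: the cubic density grid, nearest-neighbor sampling, step sizes, and all function names are illustrative assumptions.

```python
import numpy as np

def sample_density(grid, p, bounds=1.0):
    """Nearest-neighbor lookup in a cubic density grid spanning
    [-bounds, bounds]^3; points outside the volume contribute nothing.
    (Illustrative stand-in for the paper's predicted density field.)"""
    res = grid.shape[0]
    idx = np.floor((p + bounds) / (2.0 * bounds) * res).astype(int)
    if np.any(idx < 0) or np.any(idx >= res):
        return 0.0
    return float(grid[idx[0], idx[1], idx[2]])

def transmittance(density, origin, sun_dir, n_steps=64, step=0.05):
    """March from `origin` toward the light, accumulating optical depth;
    Beer-Lambert transmittance near 0 means the point is in cast shadow."""
    tau = 0.0
    for i in range(n_steps):
        p = origin + (i + 0.5) * step * sun_dir
        tau += sample_density(density, p) * step
    return np.exp(-tau)

def single_bounce_shading(normals, points, density, sun_dir):
    """Per-pixel direct shading: visibility * max(0, n . l)."""
    h, w, _ = normals.shape
    shading = np.zeros((h, w), dtype=np.float32)
    for y in range(h):
        for x in range(w):
            n_dot_l = max(0.0, float(normals[y, x] @ sun_dir))
            if n_dot_l > 0.0:  # skip back-facing pixels entirely
                shading[y, x] = n_dot_l * transmittance(
                    density, points[y, x], sun_dir)
    return shading

# Toy usage: random density over a flat, upward-facing patch.
rng = np.random.default_rng(0)
density = rng.uniform(0.0, 2.0, size=(32, 32, 32))
normals = np.tile(np.array([0.0, 0.0, 1.0]), (8, 8, 1))
points = np.zeros((8, 8, 3))
sun = np.array([0.5, 0.0, 0.8660254])  # unit vector toward the light
print(single_bounce_shading(normals, points, density, sun))
```

Because the visibility term is a function of the light direction alone, re-running the shadow march with a different `sun_dir` yields a new shading map for the same image, which is how a single input image can produce training pairs for arbitrary lighting directions.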
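The summary does not detail the control module's architecture. A ControlNet-style design, where a trainable encoder maps the control signal (1-channel direct shading stacked with 3-channel normals) to a zero-initialized residual added to the frozen denoiser's features, is a natural fit and is sketched below as a hypothetical PyTorch module; channel counts, depth, and names are assumptions, not the paper's specification.

```python
import torch
import torch.nn as nn

class LightingControlModule(nn.Module):
    """Hypothetical ControlNet-style encoder for shading + normal
    conditioning. The zero-initialized output projection makes the
    module a no-op at the start of training, so the frozen diffusion
    backbone's behavior is preserved until the control is learned."""

    def __init__(self, base_channels=320):  # 320 matches Stable Diffusion's first UNet block
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 64, 3, padding=1), nn.SiLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(128, base_channels, 3, stride=2, padding=1), nn.SiLU(),
        )
        self.zero_proj = nn.Conv2d(base_channels, base_channels, 1)
        nn.init.zeros_(self.zero_proj.weight)
        nn.init.zeros_(self.zero_proj.bias)

    def forward(self, shading: torch.Tensor, normals: torch.Tensor) -> torch.Tensor:
        # shading: (B, 1, H, W) direct shading; normals: (B, 3, H, W).
        ctrl = torch.cat([shading, normals], dim=1)  # (B, 4, H, W)
        return self.zero_proj(self.encoder(ctrl))    # residual feature

# Usage: the residual would be added to a matching feature map of the
# frozen denoising UNet at each diffusion step.
module = LightingControlModule()
residual = module(torch.rand(1, 1, 64, 64), torch.rand(1, 3, 64, 64) * 2 - 1)
print(residual.shape)  # torch.Size([1, 320, 16, 16]); all zeros at init
```

The same conditioning pattern extends to the relighting model, which would additionally take the source image (e.g., as extra input channels) alongside the target shading.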