17 Apr 2024 | Daniel Geng*, Inbum Park*, and Andrew Owens
The paper "Factorized Diffusion: Perceptual Illusions by Noise Decomposition" by Daniel Geng, Inbum Park, and Andrew Owens presents a method to control individual components of an image through diffusion model sampling. The authors decompose an image into a sum of linear components, such as low and high spatial frequencies, grayscale and color, or a motion-blur decomposition, and condition each component on a different text prompt. This yields hybrid images whose appearance changes with viewing distance, illumination conditions, or motion blur. The method works by denoising with a composite noise estimate, built from the corresponding components of noise estimates conditioned on each prompt.
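The composite noise estimate described above can be illustrated with a minimal NumPy sketch for the low/high spatial frequency case. This is an illustration under stated assumptions, not the authors' implementation: the Gaussian low-pass filter, the `sigma` value, and the function names `low_pass`, `high_pass`, and `composite_noise` are all hypothetical, and random arrays stand in for the noise estimates a diffusion model would produce for each prompt.

```python
import numpy as np

def low_pass(x, sigma=3.0):
    """Linear low-frequency component: circular Gaussian blur applied via the FFT."""
    fy = np.fft.fftfreq(x.shape[0])[:, None]  # vertical frequencies, cycles/pixel
    fx = np.fft.fftfreq(x.shape[1])[None, :]  # horizontal frequencies, cycles/pixel
    # Frequency response of a Gaussian with standard deviation `sigma` pixels.
    response = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(x) * response))

def high_pass(x, sigma=3.0):
    """Residual high-frequency component, so that low_pass(x) + high_pass(x) == x."""
    return x - low_pass(x, sigma)

def composite_noise(noise_estimates, components):
    """Sum component i of noise estimate i (one estimate per prompt)."""
    return sum(f(eps) for f, eps in zip(components, noise_estimates))

# Stand-ins for the model's noise estimates under two prompts (assumption:
# a real pipeline would obtain these from a text-conditioned diffusion model).
rng = np.random.default_rng(0)
eps_prompt_a = rng.standard_normal((64, 64))  # prompt meant to be seen from afar
eps_prompt_b = rng.standard_normal((64, 64))  # prompt meant to be seen up close

eps = composite_noise([eps_prompt_a, eps_prompt_b], [low_pass, high_pass])
```

In a sampler, `eps` would replace the usual single noise estimate at each denoising step. Because the components sum back to the original signal, passing the same estimate for both prompts returns that estimate unchanged, which is a quick sanity check on any chosen decomposition.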
The paper also shows that, for certain decompositions, the method recovers prior approaches to compositional generation and spatial control, and extends the method to generate hybrid images from real images by solving an inverse problem. The authors provide qualitative and quantitative evaluations, showing that their method produces higher-quality hybrid images than traditional methods. The paper discusses the method's limitations and ethical considerations, particularly the risk that generated illusions could be used for misinformation.