2024 | Heli Ben-Hamu, Omri Puny, Itai Gat, Brian Karrer, Uriel Singer, Yaron Lipman
D-Flow is a framework for controlled generation that differentiates through the generation process of diffusion and flow-matching (FM) models. The method controls generation by optimizing an arbitrary cost function with respect to the source (noise) point of the generative flow. The key observation is that differentiating through the generation process projects the gradient onto the data manifold, implicitly injecting prior information into the optimization.

The approach is validated on image and audio inverse problems and on conditional molecule generation, achieving state-of-the-art performance. The framework is simple and effective, producing high-quality results without extensive tuning, and it works in both linear and non-linear settings. A theoretical analysis supports the implicit regularization introduced by differentiating through the flow, which aligns the optimization with the data distribution learned by diffusion/flow models. The method is implemented with gradient-based optimization through a differentiable ODE solver. The results show that D-Flow performs well across multiple domains, including image inpainting, audio inpainting, and molecule generation, making it a broadly applicable tool for controlled generation.
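The core loop described above can be sketched in a few lines: integrate the flow ODE from a noise point, score the result with an arbitrary cost, and backpropagate through the entire solve to update the noise. The sketch below is a minimal toy illustration, not the paper's implementation: the `velocity` field is a hypothetical stand-in for a trained flow-matching model, and the target-matching cost, step counts, and learning rate are arbitrary choices for demonstration.

```python
import torch

torch.manual_seed(0)

def velocity(x, t):
    """Hypothetical toy velocity field standing in for a trained FM model."""
    return -x  # contracts samples toward the origin (illustrative only)

def generate(x0, steps=20):
    """Euler integration of the flow ODE from t=0 to t=1.

    Every step is a differentiable tensor op, so gradients flow
    back through the whole solve to the source point x0.
    """
    x, dt = x0, 1.0 / steps
    for i in range(steps):
        x = x + dt * velocity(x, i * dt)
    return x

# D-Flow-style controlled generation: optimize the source noise x0 so
# that the *generated* sample minimizes an arbitrary cost (here, squared
# distance to a target point).
target = torch.tensor([1.0, -1.0])
x0 = torch.randn(2, requires_grad=True)
opt = torch.optim.Adam([x0], lr=0.1)
for _ in range(200):
    opt.zero_grad()
    loss = ((generate(x0) - target) ** 2).sum()
    loss.backward()  # differentiates through the ODE integration
    opt.step()
```

Because the cost is applied to the solver's output rather than to `x0` directly, the update to `x0` is shaped by the flow map's Jacobian, which is the mechanism behind the implicit data-manifold regularization discussed above.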