RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control

27 May 2024 | Litu Rout, Yujia Chen, Nataniel Ruiz, Abhishek Kumar, Constantine Caramanis, Sanjay Shakkottai, Wen-Sheng Chu
RB-Modulation is a training-free method for personalizing diffusion models using stochastic optimal control. It enables precise control over both content and style by incorporating a style descriptor into the terminal cost of a stochastic optimal controller, so that generated images adhere to the given text prompt without leaking content from the reference style image. The method also introduces an Attention Feature Aggregation (AFA) module that decouples content and style from the reference image, enabling effective content-style composition.
RB-Modulation outperforms existing training-free methods on human preference and prompt alignment metrics, and both theoretical justification and empirical evidence support its effectiveness. It applies to a range of tasks, including image stylization and content-style composition. The framework is plug-and-play: it requires no training or fine-tuning of the diffusion model, which makes it a flexible and efficient option for content and style personalization.
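As a rough illustration of what placing a style descriptor in a terminal cost can mean, the toy sketch below nudges a predicted clean sample toward a reference style by minimizing a proximity term plus a weighted style-feature distance. Everything in it is an assumption made for illustration: the linear `style_descriptor`, the matrix `W`, the weight `gamma`, and the gradient-descent routine `controlled_estimate` are hypothetical stand-ins, not the paper's descriptor, controller, or algorithm.

```python
# Toy sketch only: a hypothetical linear "style descriptor" and a simple
# gradient-descent controller; not the method from the paper.

def style_descriptor(x, W):
    """Map a sample x to style features with a (hypothetical) linear map W."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def terminal_cost(x, W, s_ref):
    """Squared distance between the style features of x and the reference features."""
    return sum((f - s) ** 2 for f, s in zip(style_descriptor(x, W), s_ref))


def controlled_estimate(x0_hat, W, s_ref, gamma=2.0, lr=0.05, steps=200):
    """Minimize ||x - x0_hat||^2 + gamma * terminal_cost(x) by gradient descent.

    The first term keeps x close to the uncontrolled prediction x0_hat;
    the second pulls its style features toward s_ref.
    """
    x = list(x0_hat)
    for _ in range(steps):
        resid = [f - s for f, s in zip(style_descriptor(x, W), s_ref)]
        grad = [
            2.0 * (x[j] - x0_hat[j])
            + 2.0 * gamma * sum(W[i][j] * resid[i] for i in range(len(W)))
            for j in range(len(x))
        ]
        x = [xj - lr * gj for xj, gj in zip(x, grad)]
    return x


# Hypothetical inputs: a 3-dimensional "sample" and 2-dimensional style features.
W = [[1.0, 0.5, 0.0], [0.0, 1.0, -0.5]]
x0_hat = [0.0, 0.0, 0.0]   # uncontrolled prediction of the clean sample
s_ref = [1.0, -1.0]        # style features of the reference image

x_ctrl = controlled_estimate(x0_hat, W, s_ref)
cost_before = terminal_cost(x0_hat, W, s_ref)
cost_after = terminal_cost(x_ctrl, W, s_ref)
print(cost_before, cost_after)
```

In this toy setting the controlled sample has a lower style cost than the uncontrolled one, while the proximity term limits how far it drifts from the original prediction. A larger `gamma` trades more of that proximity for closer style agreement.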