DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization

23 Apr 2024 | Jisu Nam*, Heesu Kim, DongJae Lee, Siyoon Jin, Seungryong Kim†, Seunggyu Chang†
DreamMatcher is a plug-in method for text-to-image (T2I) personalization that enhances subject appearance while preserving target structure. It uses semantic matching to align the reference appearance with the target structure in the self-attention module of pre-trained T2I models. Specifically, DreamMatcher replaces the target values with reference values aligned by semantic matching, while leaving the structure path unchanged so that the model retains its ability to generate diverse structures. A semantic-consistent masking strategy isolates the personalized concept from irrelevant regions, ensuring that only the relevant parts of the reference are used.
DreamMatcher is compatible with any existing T2I personalized model without training or fine-tuning. Evaluated on three different baselines, it achieves state-of-the-art performance, outperforming existing tuning-free plug-in methods and even a learnable method, with significant improvements in complex scenarios. It also remains effective in challenging settings, including extreme non-rigid personalization. Ablation studies confirm the effectiveness of the design choices and the importance of each component.
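The value replacement described above can be illustrated with a small, dependency-free sketch. This is not the authors' implementation. The summary does not say how the semantic matching or the mask is computed, so the sketch assumes nearest-neighbor matching by cosine similarity and a simple similarity threshold for the mask. The function names, the `feat_tgt`/`feat_ref` inputs, and the `threshold` parameter are all hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)) + 1e-8)


def match_reference_to_target(feat_tgt, feat_ref):
    """For each target token, return (best reference index, similarity).

    ASSUMPTION: nearest-neighbor matching on cosine similarity stands in
    for the semantic matching step, which the summary does not specify.
    """
    matches = []
    for ft in feat_tgt:
        sims = [cosine(ft, fr) for fr in feat_ref]
        j = max(range(len(sims)), key=sims.__getitem__)
        matches.append((j, sims[j]))
    return matches


def appearance_matching_attention(q_tgt, k_tgt, v_tgt, v_ref,
                                  feat_tgt, feat_ref, threshold=0.5):
    d = len(q_tgt[0])
    # Structure path: the attention map comes only from the target
    # queries and keys and is left unchanged.
    attn = [softmax([dot(q, k) / math.sqrt(d) for k in k_tgt]) for q in q_tgt]

    # Appearance path: bring reference values into the target layout.
    matches = match_reference_to_target(feat_tgt, feat_ref)
    v_mixed = []
    for i, (j, sim) in enumerate(matches):
        # ASSUMPTION: a hard similarity threshold plays the role of the mask
        # that keeps unmatched (irrelevant) regions on the target values.
        m = 1.0 if sim >= threshold else 0.0
        v_mixed.append([m * r + (1.0 - m) * t
                        for r, t in zip(v_ref[j], v_tgt[i])])

    # Aggregate the mixed values with the unchanged attention map.
    channels = len(v_mixed[0])
    out = [[sum(a * v[c] for a, v in zip(row, v_mixed))
            for c in range(channels)] for row in attn]
    return out, attn, v_mixed
```

Each argument is a list of per-token vectors. Because `attn` never depends on `v_ref`, changing the reference alters only the aggregated appearance, not the layout the attention map encodes.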