Diffusion Models for Generative Outfit Recommendation

July 14-18, 2024 | Yiyan Xu, Wenjie Wang, Fuli Feng, Yunshan Ma, Jizhi Zhang, Xiangnan He
This paper introduces a new task, Generative Outfit Recommendation (GOR), which aims to generate a set of fashion images and compose them into a visually compatible outfit tailored to a specific user. GOR has three key objectives: high fidelity, compatibility, and personalization of the generated outfits. To achieve them, we propose DiFashion, a generative outfit recommender model that adapts diffusion models to generate multiple fashion images in parallel. We design three kinds of conditions to guide the parallel generation process and adopt Classifier-Free-Guidance to strengthen the alignment between the generated images and these conditions. We apply DiFashion to both the personalized Fill-In-The-Blank and GOR tasks and conduct extensive experiments on the iFashion and Polyvore-U datasets. Quantitative and human-involved qualitative evaluations demonstrate the superiority of DiFashion over competitive baselines.
DiFashion comprises two processes: a forward process that gradually corrupts the fashion images within the same outfit with Gaussian noise, and a reverse process that conditionally denoises them to generate the images in parallel. Three conditions, namely the category prompt, the mutual condition, and the history condition, guide the parallel generation toward the three criteria of GOR.
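The forward process described above can be illustrated with a minimal sketch. It assumes the standard DDPM closed form x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise with a linear beta schedule. The schedule values, the shared timestep for all items, the function names `linear_alpha_bars` and `noise_outfit`, and the toy three-value "images" are illustrative assumptions, not details taken from the paper or the DiFashion code.

```python
import math
import random


def linear_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule.

    The schedule endpoints are common DDPM defaults, assumed here.
    """
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_outfit(outfit, t, alpha_bars, rng):
    """Corrupt every item of one outfit to noise level t.

    Assumption: all items in the outfit share the same timestep t.
    """
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    return [
        [signal * x + sigma * rng.gauss(0.0, 1.0) for x in item]
        for item in outfit
    ]


alpha_bars = linear_alpha_bars(1000)
rng = random.Random(0)

# Two toy "images", each flattened to three values (hypothetical data).
outfit = [[0.5, -0.2, 0.1], [0.9, 0.3, -0.7]]

slightly_noisy = noise_outfit(outfit, 10, alpha_bars, rng)
mostly_noise = noise_outfit(outfit, 999, alpha_bars, rng)
```

At a small timestep the items stay close to the originals; at the last timestep almost no signal remains, which is the starting point for the reverse (denoising) phase.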
Specifically: 1) for high fidelity, DiFashion employs category prompts to ensure category consistency and adopts Classifier-Free-Guidance for the three conditions to enhance image quality and the alignment between the generated images and the conditions; 2) for compatibility, a mutual encoder encodes the fashion images within the same outfit, at different noise levels, into compatibility information that serves as the mutual condition; and 3) for personalization, a history encoder leverages users' historical interactions with fashion products to capture their personalized tastes as the history condition.
We instantiate DiFashion on two tasks: Personalized Fill-In-The-Blank (PFITB) and GOR. Given users' interaction history and designated categories, the PFITB task involves generating a fashion product that complements an incomplete outfit, while the GOR task requires producing a whole compatible outfit. We compare DiFashion with various baselines, including generative and retrieval-based models, on the iFashion and Polyvore-U datasets; the experiments demonstrate its superiority in both tasks. We have released our code and data at https://github.com/YiyanXu/DiFashion.
The key contributions of this work are as follows: 1) We propose a new generative outfit recommendation task that uses generative AI to create personalized outfits for users. This initiative opens a promising avenue for outfit recommendation and contributes to a more personalized fashion landscape. 2) We present DiFashion, a generative outfit recommender model that is adept at generating multiple fashion images in parallel while pursuing several generation objectives. 3) We
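As a rough illustration of the mutual condition, the sketch below gives each item a compatibility signal built from the other items in the same outfit. The mutual encoder in DiFashion is a learned module; the leave-one-out average used here is a deliberately simple stand-in, and `mutual_conditions` and the two-dimensional toy vectors are hypothetical.

```python
def mutual_conditions(item_vectors):
    """For each item, average the vectors of all *other* items in the outfit.

    Stand-in for a learned mutual encoder (assumption, not the paper's design).
    """
    n = len(item_vectors)
    if n < 2:
        raise ValueError("an outfit needs at least two items")
    dim = len(item_vectors[0])
    # Per-dimension totals over the whole outfit.
    totals = [sum(v[d] for v in item_vectors) for d in range(dim)]
    # Subtract the item's own contribution, then average over the rest.
    return [
        [(totals[d] - v[d]) / (n - 1) for d in range(dim)]
        for v in item_vectors
    ]


# Three toy item representations (hypothetical values).
outfit_vectors = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
conds = mutual_conditions(outfit_vectors)
```

Because every item is conditioned on its companions rather than on itself, all items can be denoised in parallel while still being pulled toward a mutually compatible result.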
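Classifier-free guidance with several conditions can be sketched as follows. The snippet assumes one common formulation, in which each condition contributes its own scaled offset from the unconditional noise prediction. The exact combination and guidance scales used by DiFashion may differ, and `guided_noise`, the condition names used as dictionary keys, and the numbers are illustrative.

```python
def guided_noise(eps_uncond, eps_cond, scales):
    """Combine noise predictions under multi-condition classifier-free guidance.

    eps_uncond: prediction with every condition dropped.
    eps_cond:   mapping of condition name -> prediction with that condition.
    scales:     mapping of condition name -> guidance scale (assumed values).
    """
    guided = list(eps_uncond)
    for name, eps_c in eps_cond.items():
        s = scales[name]
        for i, (e_c, e_u) in enumerate(zip(eps_c, eps_uncond)):
            # Move away from the unconditional prediction toward this condition.
            guided[i] += s * (e_c - e_u)
    return guided


# Toy two-value noise predictions (hypothetical numbers).
eps_uncond = [0.0, 1.0]
eps_cond = {
    "category": [1.0, 1.0],
    "mutual": [0.0, 2.0],
    "history": [0.0, 1.0],
}
scales = {"category": 2.0, "mutual": 1.0, "history": 3.0}

guided = guided_noise(eps_uncond, eps_cond, scales)
```

Setting every scale to zero recovers the unconditional prediction; larger scales push the denoising step more strongly toward the corresponding condition.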