June 17, 2015 | Julian McAuley, Christopher Targett, Qinfeng ('Javen') Shi, and Anton van den Hengel
This paper presents a method for image-based recommendations on styles and substitutes, aiming to model the human sense of which objects complement or can substitute for one another based on their appearance. Rather than relying on fine-grained user annotations, the approach uncovers these notions of visual compatibility from a large-scale dataset using a scalable method. Among other applications, the system can recommend which clothes and accessories will go well together.
The authors propose a visual and relational recommender system that models human visual preferences from the appearance of objects alone. They frame the task as a network inference problem defined on graphs of related images and introduce a large-scale dataset for training and evaluation. This Styles and Substitutes dataset contains over 180 million relationships between nearly 6 million objects, derived from Amazon's product recommendations (for example, items that are viewed or bought together).
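As an illustration, here is a minimal Python sketch of how such a relationship graph could be represented; the identifiers, edge tuples, and relationship labels are hypothetical stand-ins for the co-browsing and co-purchasing link types in the dataset.

```python
from collections import defaultdict

# Hypothetical edges: (source item, target item, relationship type).
# The dataset distinguishes link types such as "also viewed",
# "also bought", and "bought together".
edges = [
    ("B0001", "B0002", "also_viewed"),
    ("B0001", "B0003", "also_bought"),
    ("B0002", "B0003", "bought_together"),
]

# Adjacency lists keyed by (item, relationship type), so that each
# link type can be trained and evaluated as its own graph.
graph = defaultdict(list)
for src, dst, rel in edges:
    graph[(src, rel)].append(dst)

print(graph[("B0001", "also_bought")])  # ['B0003']
```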
The system uses a convolutional neural network to compute a feature vector for each object and learns a parameterized distance function over those features; a shifted sigmoid then converts distance into the probability that two objects are related. The authors compare several distance functions, including a weighted nearest-neighbor metric and (low-rank) Mahalanobis transforms.
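These pieces can be sketched concretely. The minimal numpy snippet below follows the paper's shifted sigmoid, which maps a distance d to a relationship probability 1 / (1 + e^(d - c)), alongside a weighted nearest-neighbor distance and a low-rank Mahalanobis distance (M approximated as Y Yᵀ); the function names and toy dimensions are my own, and the random vectors stand in for precomputed CNN features.

```python
import numpy as np

def weighted_nn_distance(xi, xj, w):
    """Weighted nearest-neighbor distance: ||(xi - xj) * w||^2."""
    diff = (xi - xj) * w
    return float(diff @ diff)

def low_rank_mahalanobis(xi, xj, Y):
    """Low-rank Mahalanobis distance ||(xi - xj) Y||^2, i.e. M ~ Y @ Y.T,
    with Y of shape (F, K) and K much smaller than F."""
    proj = (xi - xj) @ Y
    return float(proj @ proj)

def link_probability(xi, xj, Y, c):
    """Shifted sigmoid from the paper: P(related) = 1 / (1 + exp(d - c)),
    so smaller distances map to higher probabilities."""
    d = low_rank_mahalanobis(xi, xj, Y)
    return 1.0 / (1.0 + np.exp(d - c))

# Toy usage with random vectors standing in for CNN image features.
rng = np.random.default_rng(0)
F, K = 16, 4
xi, xj = rng.normal(size=F), rng.normal(size=F)
Y = rng.normal(size=(F, K))
print(link_probability(xi, xj, Y, c=20.0))
```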
The system is evaluated across product categories including books, movies, music, and clothing. The results show that the proposed method outperforms both category-based baselines and weighted nearest-neighbor approaches. The model can also be extended to personalize recommendations to individual users' preferences.
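The personalization mentioned above can be sketched as a user-specific reweighting of the shared low-rank style space. This is only an illustration of the idea: `du`, a per-user diagonal weight vector, is a hypothetical parameterization rather than a faithful reproduction of the paper's.

```python
import numpy as np

def personalized_distance(xi, xj, Y, du):
    """User-personalized distance: project the feature difference into
    the K-dimensional style space, then weight each style dimension by
    the user's (diagonal) preference vector du before taking the norm."""
    proj = (xi - xj) @ Y  # shape (K,)
    return float((du * proj) @ proj)
```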
The authors also demonstrate how the system can generate recommendations for users of a web store based on the visual style of a query item: given a shirt, for instance, it can suggest shoes or accessories that belong to the same style and complement the query.
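Generating such recommendations then reduces to ranking candidate items by their learned probability of being related to the query. The sketch below reuses the low-rank distance and shifted sigmoid from the earlier snippet; `recommend` and the candidate feature matrix are hypothetical.

```python
import numpy as np

def recommend(query_x, candidate_X, Y, c, top_k=5):
    """Rank candidates by their probability of relating to the query
    item and return the indices of the top_k most probable matches."""
    diffs = candidate_X - query_x          # (N, F) feature differences
    proj = diffs @ Y                       # (N, K) style-space projections
    d = np.einsum("nk,nk->n", proj, proj)  # squared low-rank distances
    p = 1.0 / (1.0 + np.exp(d - c))        # shifted sigmoid probabilities
    return np.argsort(-p)[:top_k]
```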
The paper concludes that the proposed method captures a variety of visual relationships beyond simple visual similarity. The authors present it as the first attempt to model human preferences for the appearance of one object given that of another in terms of more than just the visual similarity between the two. They also externally validate the learned model by assessing how well it reproduces the coordination of outfits observed in real-world scenarios.