MMGRec: Multimodal Generative Recommendation with Transformer Model

25 Apr 2024 | Han Liu, Yinwei Wei, Xuemeng Song, Weili Guan, Yuan-Fang Li, Liqiang Nie
MMGRec is a novel multimodal recommendation system that introduces a generative paradigm to address the limitations of traditional embed-and-retrieve methods. The system assigns each item a unique Rec-ID based on its multimodal and collaborative filtering (CF) information using a Graph RQ-VAE model. Each Rec-ID consists of semantic tokens plus a popularity token that guarantees uniqueness. A Transformer-based model then generates the Rec-IDs of user-preferred items from the user's historical interaction sequence.

MMGRec also incorporates a relation-aware self-attention mechanism to handle non-sequential interactions, improving the model's ability to capture pairwise relationships between items. Extensive experiments on three public datasets show that MMGRec achieves state-of-the-art performance with efficient inference. Its key contributions are the introduction of the generative paradigm to multimodal recommendation, the design of a multimodal information quantization algorithm (Graph RQ-VAE), and the development of a relation-aware self-attention mechanism. The results show that MMGRec outperforms existing methods in both recommendation accuracy and efficiency, and that it effectively handles the challenges of multimodal recommendation, including false-negative issues and high inference costs.
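The Rec-ID assignment described above can be sketched in miniature. The snippet below is a simplified illustration, not the paper's implementation: the codebooks are random stand-ins for a trained Graph RQ-VAE, and `residual_quantize` / `assign_rec_ids` are hypothetical names. It shows the two core ideas: residual quantization turns an item embedding into a short sequence of semantic tokens, and an appended popularity-rank token disambiguates items whose semantic tokens collide.

```python
import numpy as np

def residual_quantize(embedding, codebooks):
    """Quantize an item embedding into a list of codeword indices via
    L levels of residual quantization (the core idea behind RQ-VAE)."""
    residual = np.asarray(embedding, dtype=float).copy()
    tokens = []
    for codebook in codebooks:  # codebook: (K, d) array of codewords
        # pick the codeword nearest to the current residual
        dists = np.linalg.norm(codebook - residual, axis=1)
        idx = int(np.argmin(dists))
        tokens.append(idx)
        residual -= codebook[idx]  # quantize what remains at the next level
    return tokens

def assign_rec_ids(embeddings, codebooks):
    """Assign each item a Rec-ID: semantic tokens from residual
    quantization plus a popularity-rank token to guarantee uniqueness.
    Items are assumed to arrive ordered by descending popularity."""
    seen = {}
    rec_ids = []
    for emb in embeddings:
        sem = tuple(residual_quantize(emb, codebooks))
        rank = seen.get(sem, 0)        # earlier (more popular) items with same tokens
        seen[sem] = rank + 1
        rec_ids.append(sem + (rank,))  # popularity token makes the ID unique
    return rec_ids

# Toy usage: 3 codebook levels of 8 codewords each, 10 items of dim 4.
rng = np.random.default_rng(0)
codebooks = [rng.normal(size=(8, 4)) for _ in range(3)]
item_embeddings = rng.normal(size=(10, 4))
rec_ids = assign_rec_ids(item_embeddings, codebooks)
```

Because the final token counts collisions, every Rec-ID is unique even when two items quantize to identical semantic tokens.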
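The relation-aware self-attention idea can likewise be sketched. This is a minimal single-head illustration under assumptions, not the paper's exact formulation: the `relation_bias` matrix stands in for the learned pairwise item relations, and replaces the positional encodings that would impose an artificial order on a non-sequential interaction history.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def relation_aware_attention(X, Wq, Wk, Wv, relation_bias):
    """Single-head self-attention whose logits are shifted by pairwise
    relation scores instead of relying on positional order.

    X:             (n, d) embeddings of one user's interacted items
    relation_bias: (n, n) pairwise item-relation scores; a stand-in
                   for the learned relations in the paper
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    # standard scaled dot-product logits, plus the relation term
    logits = Q @ K.T / np.sqrt(d_k) + relation_bias
    return softmax(logits, axis=-1) @ V

# Toy usage: 5 items of dim 4, random projections, zero relation bias.
rng = np.random.default_rng(1)
n, d = 5, 4
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = relation_aware_attention(X, Wq, Wk, Wv, np.zeros((n, n)))
```

With a zero bias this reduces to ordinary scaled dot-product attention; a large bias toward one item makes every query attend to that item, which is how the pairwise relation term steers attention.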