TokenRec: Learning to Tokenize ID for LLM-based Generative Recommendations


2024 | Haohao Qu, Wenqi Fan, Zihua Zhao, Qing Li, Fellow, IEEE
This paper proposes TokenRec, a novel framework for LLM-based generative recommendations that addresses the challenge of tokenizing users and items for seamless alignment with large language models (LLMs). The key contributions of TokenRec are a novel tokenization strategy that integrates high-order collaborative knowledge into LLMs and a generative retrieval paradigm that efficiently recommends top-K items without time-consuming auto-regressive decoding.

The tokenization strategy quantizes masked user/item representations learned by collaborative filtering into discrete tokens, enabling generalizable ID tokenization for LLM-based recommender systems. The generative retrieval paradigm eliminates the need for beam search and reduces inference time by directly generating item representations and retrieving suitable items for recommendation.

Comprehensive experiments on four real-world datasets (Amazon-Beauty, Amazon-Clothing, LastFM, and MovieLens 1M) show that TokenRec outperforms competitive benchmarks, including state-of-the-art collaborative filtering, sequential recommendation, and LLM-based recommendation methods, particularly in terms of Hit Ratio (HR@K) and Normalized Discounted Cumulative Gain (NDCG@K). The method is also evaluated for its generalizability to new and unseen users and items, demonstrating that it provides robust ID tokenization without retraining the LLM backbone. Overall, the proposed framework is efficient, scalable, and effective for LLM-based recommendation systems.
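The two core ideas can be illustrated with a minimal sketch. This is not the paper's actual implementation: it assumes a simple residual-quantization codebook for turning collaborative-filtering embeddings into discrete ID tokens, and plain cosine-similarity retrieval as a stand-in for the paper's generative retrieval step; all names and dimensions below are illustrative.

```python
import numpy as np

# Hypothetical sketch (not the paper's code) of:
#   (1) quantizing CF embeddings into discrete ID tokens via codebooks, and
#   (2) recommending by ranking items against a generated query vector,
#       avoiding auto-regressive beam search over token sequences.

rng = np.random.default_rng(0)

# --- (1) Tokenization: CF embeddings -> discrete codebook indices ---
num_items, dim, codebook_size, levels = 100, 16, 8, 2

item_embs = rng.normal(size=(num_items, dim))              # CF item embeddings
codebooks = rng.normal(size=(levels, codebook_size, dim))  # one codebook per level

def tokenize(emb, codebooks):
    """Residual quantization: each level picks the nearest code,
    and the leftover residual is quantized by the next level."""
    tokens, residual = [], emb.copy()
    for cb in codebooks:
        idx = int(np.argmin(np.linalg.norm(cb - residual, axis=1)))
        tokens.append(idx)
        residual = residual - cb[idx]
    return tokens  # e.g. [3, 5], rendered as special ID tokens in the LLM vocab

item_tokens = [tokenize(e, codebooks) for e in item_embs]

# --- (2) Generative retrieval: rank items by similarity to a query vector ---
query = rng.normal(size=dim)  # stand-in for the LLM's generated representation

def top_k(query, item_embs, k=5):
    sims = item_embs @ query / (
        np.linalg.norm(item_embs, axis=1) * np.linalg.norm(query) + 1e-9)
    return np.argsort(-sims)[:k]  # indices of the k most similar items

recommended = top_k(query, item_embs, k=5)
```

Retrieving the top-K items in a single similarity lookup, rather than decoding item tokens one by one with beam search, is what lets this paradigm cut inference time.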