Learnable Item Tokenization for Generative Recommendation

October 21–25, 2024 | Wenjie Wang, Honghui Bao, Xinyu Lin, Jizhi Zhang, Yongqi Li, Fuli Feng, See-Kiong Ng, and Tat-Seng Chua
The paper "Learnable Item Tokenization for Generative Recommendation" addresses the challenge of transforming recommendation data into the language space of Large Language Models (LLMs) through effective item tokenization. Existing approaches, such as ID-based, textual, and codebook-based identifiers, fall short in encoding semantic information, incorporating collaborative signals, or handling code assignment bias. To address these issues, the authors propose LETTER (a LEarnable Tokenizer for generaTive REcommendation), which integrates hierarchical semantics, collaborative signals, and code assignment diversity. LETTER builds on a Residual-Quantized VAE (RQ-VAE) for semantic regularization, adds a contrastive alignment loss for collaborative regularization, and introduces a diversity loss to mitigate code assignment bias. The tokenizer is instantiated on two generative recommender models, and a ranking-guided generation loss is proposed to enhance their ranking ability. Extensive experiments on three datasets validate the superiority of LETTER, advancing the state of the art in LLM-based generative recommendation. The contributions of the work include the proposal of LETTER, its instantiation on generative recommender models, and the theoretical and empirical validation of its effectiveness.
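To make the RQ-VAE component concrete, the sketch below illustrates residual quantization, the mechanism by which an RQ-VAE maps an item embedding to a short tuple of discrete codes (the item's token identifier). Each codebook level quantizes the residual left unexplained by the previous level, which is what gives the codes their coarse-to-fine hierarchical semantics. This is a toy illustration, not the paper's implementation: the codebook sizes, dimensions, and random codebooks here are arbitrary assumptions, and a real RQ-VAE learns its codebooks and an encoder/decoder jointly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumed sizes): 3 codebook levels, 8 codes per level,
# 4-dimensional embeddings. A trained RQ-VAE would learn these codebooks.
num_levels, codebook_size, dim = 3, 8, 4
codebooks = rng.normal(size=(num_levels, codebook_size, dim))

def tokenize(embedding: np.ndarray) -> list[int]:
    """Residual quantization: return one code index per level."""
    residual = embedding.copy()
    codes = []
    for level in range(num_levels):
        # Pick the nearest code vector at this level.
        dists = np.linalg.norm(codebooks[level] - residual, axis=1)
        idx = int(np.argmin(dists))
        codes.append(idx)
        # The next level quantizes what this level failed to capture.
        residual = residual - codebooks[level][idx]
    return codes

item_embedding = rng.normal(size=dim)
print(tokenize(item_embedding))  # a tuple of 3 code indices
```

Because earlier levels absorb the largest share of the embedding, items with similar semantics tend to share prefix codes, which is the hierarchical structure LETTER's semantic regularization relies on.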