Rethinking Large Language Model Architectures for Sequential Recommendations


2024 | Hanbing Wang, Xiaorui Liu, Wenqi Fan, Xiangyu Zhao, Venkataramana Kini, Devendra Yadav, Fei Wang, Zhen Wen, Jiliang Tang, Hui Liu
This paper proposes Lite-LLM4Rec, a simplified yet effective large language model (LLM)-based sequential recommendation model that achieves efficient inference and improved performance. The authors identify beam search decoding as the most time-consuming component of LLM-based recommenders and observe that existing item indexing methods lead to redundant computation.

To address these issues, Lite-LLM4Rec introduces a hierarchical LLM structure that encodes each item's context into a compact vector, reducing input length and computational cost. It also replaces beam search decoding with an item projection head that generates recommendations directly, eliminating the complex decoding process. The model is trained with a cross-entropy loss.
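The hierarchical idea above — an item-level encoder collapsing each item's text into one compact vector so the recommendation backbone sees one position per item rather than one per token — can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the embedding size, token counts, and mean pooling are illustrative assumptions standing in for the item LLM encoder.

```python
import numpy as np

rng = np.random.default_rng(1)
d_model = 64  # hypothetical embedding size, not from the paper

# Token-level representations of three items' text (e.g. titles),
# as an item-level LLM encoder might produce them: 12, 7, and 20 tokens.
item_token_embs = [rng.normal(size=(n_tok, d_model)) for n_tok in (12, 7, 20)]

# Hierarchical step: collapse each item's tokens into a single compact
# vector (mean pooling stands in for the item LLM's summarization here).
item_vecs = np.stack([toks.mean(axis=0) for toks in item_token_embs])

# The recommendation backbone now consumes 3 positions instead of
# 12 + 7 + 20 = 39 token positions, shrinking the input sequence.
print(item_vecs.shape)  # → (3, 64)
```

The payoff is quadratic: since self-attention cost grows with the square of sequence length, shortening the input from token count to item count cuts the backbone's compute substantially.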
Experiments on three public datasets show that Lite-LLM4Rec outperforms existing LLM-based methods in both recommendation performance and inference efficiency, including a 46.8% improvement in Recall@10 on the ML-1m dataset. Ablation studies confirm that the hierarchical LLM structure and the item projection head are both crucial to this performance, demonstrating that Lite-LLM4Rec is a promising solution for efficient sequential recommendation.
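The item projection head that replaces beam search can be sketched in a few lines: a single linear layer maps the backbone's final hidden state to one score per catalog item, so a top-k recommendation is a single sort rather than an autoregressive decode, and training reduces to a standard cross-entropy over the catalog. The sizes, the ground-truth item ID, and the plain linear head below are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_items = 64, 1000  # hypothetical sizes, not from the paper

# Final hidden state of the recommendation backbone for one user sequence.
h = rng.normal(size=d_model)

# Item projection head: one linear layer mapping the hidden state to a
# score per catalog item (replacing beam search over token IDs).
W = rng.normal(size=(n_items, d_model)) / np.sqrt(d_model)
b = np.zeros(n_items)
logits = W @ h + b  # shape (n_items,)

# Softmax over the whole catalog in one shot.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# Top-10 recommendation is a single sort — no autoregressive decoding.
top10 = np.argsort(-logits)[:10]

# Training signal: cross-entropy against the ground-truth next item
# (item 42 is a hypothetical label for illustration).
target = 42
loss = -np.log(probs[target])
```

Because inference is one forward pass plus a sort, the per-request cost no longer scales with beam width or generated token count, which is where the paper locates most of the latency of prior LLM-based recommenders.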