Breaking the Length Barrier: LLM-Enhanced CTR Prediction in Long Textual User Behaviors


2024 | Binzong Geng, Zhaoxin Huan, Xiaolu Zhang, Yong He, Liang Zhang, Fajie Yuan, Jun Zhou, Linjian Mo
This paper proposes Behavior Aggregated Hierarchical Encoding (BAHE) to address the efficiency challenge of using large language models (LLMs) for click-through rate (CTR) prediction over long textual user behavior sequences. Traditional LLM-based CTR models incur high computational cost when encoding long sequences, which limits their practical deployment. BAHE introduces a hierarchical architecture that decouples the encoding of individual user behaviors from the modeling of inter-behavior interactions, significantly improving efficiency.

BAHE first uses the LLM's pre-trained shallow layers to extract embeddings of the most granular, atomic user behaviors from extensive sequences and stores them in an offline database. This eliminates redundant encoding and allows behavior representations to be reused. The deeper, trainable layers of the LLM then model the intricate interactions among these behavior embeddings, generating a comprehensive user embedding. This separation makes the learning of high-level user representations independent of low-level behavior encoding, reducing computational complexity. The refined user embeddings, combined with processed item embeddings, are then fed into the CTR model to compute CTR scores.
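To make the decoupling concrete, here is a minimal PyTorch sketch of the idea, not the authors' implementation: the names `shallow_layers`, `deep_layers`, and `behavior_cache`, and the assumption that `llm.layers` is a list of transformer blocks mapping a `(batch, seq, hidden)` tensor to the same shape, are ours. The frozen shallow layers encode each atomic behavior once and cache the result; the trainable deep layers then attend over the short sequence of behavior embeddings rather than the full token sequence.

```python
# A minimal sketch of BAHE-style hierarchical encoding, assuming `llm.layers`
# is an indexable list of transformer blocks with shape-preserving forward().
import torch
import torch.nn as nn

class BAHEEncoder(nn.Module):
    def __init__(self, llm, split_at: int):
        super().__init__()
        # Pre-trained shallow layers: frozen, used to encode atomic
        # behaviors once, independent of any particular user or label.
        self.shallow_layers = nn.ModuleList(llm.layers[:split_at])
        for p in self.shallow_layers.parameters():
            p.requires_grad = False
        # Deeper layers stay trainable and model inter-behavior interactions.
        self.deep_layers = nn.ModuleList(llm.layers[split_at:])
        self.behavior_cache = {}  # stand-in for the offline embedding store

    @torch.no_grad()
    def encode_behavior(self, token_emb: torch.Tensor) -> torch.Tensor:
        # token_emb: (seq_len, hidden) token embeddings of ONE atomic behavior.
        h = token_emb.unsqueeze(0)
        for layer in self.shallow_layers:
            h = layer(h)
        return h.mean(dim=1).squeeze(0)  # pool tokens into one behavior vector

    def user_embedding(self, behaviors) -> torch.Tensor:
        # behaviors: list of (key, token_emb); reuse cached embeddings and
        # encode only behaviors not seen before.
        vecs = []
        for key, token_emb in behaviors:
            if key not in self.behavior_cache:
                self.behavior_cache[key] = self.encode_behavior(token_emb)
            vecs.append(self.behavior_cache[key])
        # The deep layers now attend over B behavior vectors instead of
        # B * L tokens, which is where the efficiency gain comes from.
        h = torch.stack(vecs).unsqueeze(0)  # (1, num_behaviors, hidden)
        for layer in self.deep_layers:
            h = layer(h)
        return h.mean(dim=1).squeeze(0)  # final user embedding
```

Because `encode_behavior` runs under `torch.no_grad()`, gradients flow only into the deep layers, matching the paper's split between frozen low-level encoding and trainable high-level interaction learning.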
BAHE has been successfully deployed in a real-world system, reducing training time and memory usage by a factor of five for LLM-based CTR models, with the largest gains on longer user sequences. It enables daily updates on 50 million CTR samples using 8 A100 GPUs, making LLMs practical for industrial CTR prediction. The paper's main contributions are identifying the efficiency bottleneck in LLM-based CTR modeling with long user sequences and proposing a hierarchical structure that decouples behavior representation extraction from interaction learning. Extensive experiments show that BAHE significantly improves the efficiency of LLM-based CTR models and reduces computational resource consumption in real-world applications. The method is model-agnostic and can be applied to any embedding-based CTR model.
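Since the method is described as model-agnostic for any embedding-based CTR model, the final scoring step could look like the hypothetical head below: a small MLP over the concatenated user and item embeddings. The class name and layer sizes are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class SimpleCTRHead(nn.Module):
    """Hypothetical embedding-based CTR model: concatenates the BAHE user
    embedding with an item embedding and scores the pair with a small MLP."""
    def __init__(self, hidden: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, user_emb: torch.Tensor, item_emb: torch.Tensor) -> torch.Tensor:
        logits = self.mlp(torch.cat([user_emb, item_emb], dim=-1))
        return torch.sigmoid(logits).squeeze(-1)  # predicted CTR in (0, 1)

# Usage sketch: score one user-item pair with 768-dimensional embeddings.
head = SimpleCTRHead(hidden=768)
ctr = head(torch.randn(1, 768), torch.randn(1, 768))
```

Any embedding-based scorer (inner product, DeepFM, DIN, etc.) could replace this head without changing the BAHE encoder, which is what makes the approach model-agnostic.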