DKN: Deep Knowledge-Aware Network for News Recommendation
Hongwei Wang, Fuzheng Zhang, Xing Xie, and Minyi Guo propose the deep knowledge-aware network (DKN), a content-based deep recommendation framework for click-through rate prediction that incorporates knowledge graph representation into news recommendation. The key component of DKN is a knowledge-aware convolutional neural network (KCNN): a multi-channel, word-entity-aligned CNN that fuses the semantic-level and knowledge-level representations of news. KCNN treats words and entities as multiple channels and explicitly keeps their alignment relationship during convolution. In addition, DKN contains an attention module that dynamically aggregates a user's click history with respect to the current candidate news. Through extensive experiments on a real online news platform, the authors demonstrate that DKN achieves substantial gains over state-of-the-art deep recommendation models and validate the efficacy of the use of knowledge in DKN.

The overall framework of DKN is illustrated in Figure 3. The knowledge distillation process, illustrated in Figure 4, consists of four steps: entity linking first disambiguates mentions in news texts by associating them with predefined entities in a knowledge graph; a sub-graph is then constructed from the identified entities, with all relational links among them extracted from the original knowledge graph; a knowledge graph embedding method is applied to the sub-graph; and the learned entity embeddings are taken as input to DKN. The context of an entity e is defined as the set of its immediate neighbors in the knowledge graph, and the context embedding of e is computed as the average of the embeddings of its contextual entities. The architecture of KCNN is illustrated in the lower-left part of Figure 3: for each news title, the word embeddings, transformed entity embeddings, and transformed context embeddings form the input channels, and the convolution output is the embedding of the title.
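To make the multi-channel, word-entity-aligned convolution concrete, here is a minimal PyTorch sketch of a KCNN-style encoder. It is an illustration rather than the authors' implementation: the dimension names, layer sizes, and window sizes are assumptions, while the transformation tanh(Me + b) that maps entity embeddings into the word-embedding space follows the paper's description.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KCNN(nn.Module):
    """Sketch of a knowledge-aware CNN in the spirit of DKN's KCNN.

    The word, entity, and entity-context embeddings of a news title are
    treated as three aligned channels and convolved jointly, in the
    style of a Kim-type text CNN. All sizes are illustrative.
    """

    def __init__(self, word_dim=100, entity_dim=50,
                 num_filters=100, window_sizes=(1, 2, 3, 4)):
        super().__init__()
        # g(e) = tanh(M e + b): map the entity space into the word
        # space so the channels stay dimensionally aligned.
        self.transform = nn.Linear(entity_dim, word_dim)
        # One 2-D convolution per window size over the 3-channel
        # "image" of shape (3, title_len, word_dim).
        self.convs = nn.ModuleList([
            nn.Conv2d(3, num_filters, kernel_size=(l, word_dim))
            for l in window_sizes
        ])

    def forward(self, words, entities, contexts):
        # words:    (batch, title_len, word_dim)
        # entities: (batch, title_len, entity_dim); zero rows where a
        #           word has no linked entity
        # contexts: (batch, title_len, entity_dim); neighbor averages
        ent = torch.tanh(self.transform(entities))
        ctx = torch.tanh(self.transform(contexts))
        # Stack as channels: (batch, 3, title_len, word_dim).
        x = torch.stack([words, ent, ctx], dim=1)
        pooled = []
        for conv in self.convs:
            c = torch.relu(conv(x)).squeeze(3)          # (B, filters, L')
            pooled.append(F.max_pool1d(c, c.size(2)).squeeze(2))
        # News embedding: concatenated max-over-time features.
        return torch.cat(pooled, dim=1)
```

Keeping the channels aligned position by position lets every filter see a word together with its linked entity and that entity's context, which is the paper's argument for using channels instead of simply concatenating the three embeddings.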
The attention-based extraction of user interests is presented in Section 4.4: for each piece of clicked news in the user's history, an attention network scores its relevance to the candidate news, and the user embedding is computed as the attention-weighted sum of the clicked-news embeddings. The user embedding and the candidate-news embedding are then concatenated and fed into a deep neural network (DNN) to calculate the predicted probability that the user will click the candidate news.

The experiments use a dataset built from the server logs of Bing News. The results show that DKN outperforms state-of-the-art deep-learning-based baselines in terms of both F1 and AUC, and ablations confirm that the use of knowledge and the attention module each bring additional improvements to the framework. The authors also present a visualization of attention values in Section 5.5 to intuitively demonstrate the efficacy of the knowledge graph, along with a case study of a user's reading interests and a discussion of hyper-parameter tuning. Overall, the results suggest that DKN is competitive and robust in practical applications.
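As a companion sketch, the snippet below shows one way to realize the attention-based aggregation and the final click predictor described above. The two-layer MLPs and their hidden sizes are assumptions for illustration; the paper specifies only that one DNN scores each (clicked news, candidate news) pair and another DNN maps the concatenated user and candidate embeddings to a click probability.

```python
import torch
import torch.nn as nn

class AttentionClickPredictor(nn.Module):
    """Sketch of DKN-style attention aggregation plus click prediction.
    Hidden sizes and depths are illustrative assumptions."""

    def __init__(self, news_dim, hidden=64):
        super().__init__()
        # Attention net: scores a (clicked, candidate) embedding pair.
        self.att = nn.Sequential(
            nn.Linear(2 * news_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )
        # Prediction net: maps [user, candidate] to a click logit.
        self.dnn = nn.Sequential(
            nn.Linear(2 * news_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, clicked, candidate):
        # clicked:   (batch, history_len, news_dim) KCNN title embeddings
        # candidate: (batch, news_dim)
        n = clicked.size(1)
        cand = candidate.unsqueeze(1).expand(-1, n, -1)
        # Softmax-normalized attention of the candidate over history.
        scores = torch.softmax(
            self.att(torch.cat([clicked, cand], dim=2)).squeeze(2), dim=1)
        # User embedding: attention-weighted sum of clicked embeddings.
        user = torch.bmm(scores.unsqueeze(1), clicked).squeeze(1)
        # Predicted probability that the user clicks the candidate.
        return torch.sigmoid(self.dnn(torch.cat([user, candidate], dim=1)))
```

Because both the clicked and candidate embeddings come from KCNN, the two sketches compose end to end: encode every title with KCNN, then call the predictor with the stacked history and the candidate embedding.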