September 15-19, 2016 | Paul Covington, Jay Adams, Emre Sargin
This paper describes how deep neural networks are applied in YouTube's recommendation system, one of the largest and most complex recommendation systems in the world. The system is split into two stages: candidate generation and ranking. The candidate generation model uses deep learning to select, from a large corpus, a small subset of videos that are likely to be relevant to the user. The ranking model then orders these candidates by expected watch time rather than click probability. The paper discusses the challenges of scale, freshness, and noise, and how deep learning helps address each of them. The models are trained with a large-scale distributed framework, TensorFlow, and have approximately one billion parameters. Training relies on implicit feedback, such as video watches. The paper also stresses the importance of feature engineering: sparse categorical features are represented with embeddings, and continuous features are normalized. The age of each training example is included as a feature to account for the non-stationary nature of video popularity. Increasing the depth and width of the networks improves performance but also raises computational cost. The system was evaluated through A/B testing and showed improvements in both offline and online metrics. The paper concludes that deep learning significantly improves the performance of the recommendation system, and presents deep neural networks as a general-purpose solution for many learning problems.
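The two-stage structure described above can be illustrated with a toy sketch. It is not the paper's implementation: the function names, the dot-product relevance score, and the `expected_watch_time` callback are assumptions made for illustration, standing in for the learned candidate generation and ranking networks.

```python
from typing import Callable, Dict, List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def generate_candidates(
    user_vec: Sequence[float],
    video_vecs: Dict[str, Sequence[float]],
    n: int,
) -> List[str]:
    """Stage 1 (assumed form): keep the n videos whose embeddings
    score highest against the user embedding."""
    by_score = sorted(
        video_vecs,
        key=lambda vid: dot(user_vec, video_vecs[vid]),
        reverse=True,
    )
    return by_score[:n]


def rank_candidates(
    candidates: List[str],
    expected_watch_time: Callable[[str], float],
) -> List[str]:
    """Stage 2 (assumed form): order candidates by a predicted
    expected watch time instead of a click probability."""
    return sorted(candidates, key=expected_watch_time, reverse=True)


def recommend(
    user_vec: Sequence[float],
    video_vecs: Dict[str, Sequence[float]],
    expected_watch_time: Callable[[str], float],
    n_candidates: int,
    k: int,
) -> List[str]:
    """Run both stages and return the top-k ranked videos."""
    candidates = generate_candidates(user_vec, video_vecs, n_candidates)
    return rank_candidates(candidates, expected_watch_time)[:k]


# Hypothetical toy data: 2-d embeddings and made-up watch-time predictions.
video_vecs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
watch_time = {"a": 30.0, "b": 120.0, "c": 600.0, "d": 5.0}
top = recommend([1.0, 0.0], video_vecs, watch_time.get, n_candidates=2, k=2)
```

In this toy data, video "c" has the largest predicted watch time but never reaches the ranking stage, because candidate generation filters it out first. That is the point of the split: the expensive ranking model only sees a small, pre-filtered set.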
The paper highlights the effectiveness of deep learning in modeling complex interactions between features and the importance of using appropriate loss functions for ranking.
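One way to see why the choice of loss matters for ranking by watch time is a weighted logistic regression, in which clicked impressions are weighted by their watch time and unclicked impressions get unit weight. The sketch below is a deliberately minimal, intercept-only version under that assumption; the function name, learning rate, and data are hypothetical. With such weights, the learned odds converge to (total watch time) / (number of unclicked impressions), which is close to the mean watch time per impression when clicks are rare.

```python
import math
from typing import List


def learned_odds(
    watch_times: List[float],
    n_negatives: int,
    steps: int = 2000,
    lr: float = 0.5,
) -> float:
    """Fit an intercept-only weighted logistic regression by gradient
    descent and return the learned odds exp(b).

    Positives (clicked impressions) are weighted by watch time;
    negatives (unclicked impressions) have weight 1.
    """
    labels = [1.0] * len(watch_times) + [0.0] * n_negatives
    weights = list(watch_times) + [1.0] * n_negatives
    total_weight = sum(weights)

    b = 0.0  # the single model parameter (log-odds)
    for _ in range(steps):
        p = 1.0 / (1.0 + math.exp(-b))
        # Gradient of the weighted log-loss with respect to b,
        # normalized by total weight for a stable step size.
        grad = sum(w * (p - y) for y, w in zip(labels, weights)) / total_weight
        b -= lr * grad
    return math.exp(b)


# Hypothetical data: 2 clicked impressions (3 and 5 minutes watched)
# out of 40 impressions in total.
watch_times = [3.0, 5.0]
n_negatives = 38
odds = learned_odds(watch_times, n_negatives)
closed_form = sum(watch_times) / n_negatives  # total watch time / unclicked count
mean_watch_time = sum(watch_times) / (len(watch_times) + n_negatives)
```

Here `odds` matches `closed_form`, and both are close to `mean_watch_time`; the gap shrinks as the click rate falls. A plain, unweighted click model would instead learn the click probability and ignore how long each video was watched.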