May 13–17, 2024, Singapore | Marco De Nadai1*, Francesco Fabbri1*, Paul Giglioli1, Alice Wang1, Ang Li1, Fabrizio Silvestri1,2, Laura Kim1, Shawn Lin1, Vladan Radosavljevic1, Sandeep Ghael1, David Nyhan1, Hugues Bouchard1, Mounia Lalmas-Roelleke1, Andreas Damianou1
Spotify has introduced audiobooks to its platform, which makes personalized recommendation difficult: interaction data for the new content type is sparse, yet recommendations still need to be relevant. To address this, the team at Spotify developed 2T-HGNN, a scalable recommendation system that combines Heterogeneous Graph Neural Networks (HGNNs) with a Two Tower (2T) model. The approach leverages users' podcast and music preferences to uncover nuanced item relationships while keeping latency and complexity low. The model decouples users from the HGNN graph and introduces a multi-link neighbor sampler to optimize training. Empirical evaluations show a significant improvement in personalized recommendations, with a 46% increase in new audiobook start rates and a 23% boost in streaming rates. The model's effectiveness extends beyond audiobooks to established products such as podcasts. The key contributions are the first investigation of audiobook recommendation at scale, a modular architecture that integrates audiobooks into Spotify's existing recommendation platform, and a balanced neighborhood sampler that addresses data imbalance. The 2T-HGNN model is now in production, serving millions of users.
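To make the idea of a balanced neighborhood sampler concrete, here is a minimal sketch in plain Python. It is not the sampler described by the authors. It assumes one simple balancing rule (draw the same number of neighbors from each edge type), and the edge-type names, node ids, and function names are hypothetical.

```python
import random
from collections import defaultdict


def build_adjacency(edges):
    """Group neighbors by node and edge type.

    `edges` is an iterable of (src, dst, edge_type) triples, treated as
    undirected. The triple layout is an assumption made for this sketch.
    """
    adj = defaultdict(lambda: defaultdict(list))
    for src, dst, etype in edges:
        adj[src][etype].append(dst)
        adj[dst][etype].append(src)
    return adj


def sample_balanced_neighbors(adj, node, per_type, rng):
    """Sample up to `per_type` neighbors of `node` for each edge type.

    Sampling per edge type keeps an abundant link type (for example
    podcast co-listening) from crowding out a sparse one (for example
    audiobook co-listening), which uniform sampling over all neighbors
    would tend to do.
    """
    sampled = {}
    for etype, neighbors in adj.get(node, {}).items():
        k = min(per_type, len(neighbors))
        sampled[etype] = rng.sample(neighbors, k)  # without replacement
    return sampled


# Hypothetical toy graph: one audiobook with many podcast links
# and only two audiobook links.
edges = [("ab1", "pod%d" % i, "audiobook-podcast") for i in range(50)]
edges += [
    ("ab1", "ab2", "audiobook-audiobook"),
    ("ab1", "ab3", "audiobook-audiobook"),
]

adj = build_adjacency(edges)
sample = sample_balanced_neighbors(adj, "ab1", per_type=2, rng=random.Random(0))
print({etype: len(nbrs) for etype, nbrs in sample.items()})
```

In this toy graph, uniform sampling over all 52 neighbors of `ab1` would almost always return podcasts only, whereas the per-type draw gives the sparse audiobook links equal representation in the sampled neighborhood.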