Item-based top-N recommendation algorithms

This technical report, authored by Mukund Deshpande and George Karypis from the University of Minnesota, focuses on item-based top-N recommendation algorithms. The authors address the scalability issues of user-based collaborative filtering (CF) systems, which can become computationally expensive with large datasets. Item-based recommendation techniques analyze the user-item matrix to identify relationships between items and use these relationships to compute recommendations.

The key steps in the proposed item-based algorithms include:

1. Computing similarity between items using methods like cosine similarity or conditional probability.
2. Combining these similarities to determine the similarity between a basket of items and a candidate item.

The report presents two main methods for computing item similarities:

- **Cosine-Based Similarity**: Measures similarity using the cosine of the vectors representing items in the user space.
- **Conditional Probability-Based Similarity**: Uses conditional probabilities to measure similarity, adjusted for item frequencies.

The authors also introduce a method to normalize similarities to account for different densities of item neighborhoods, improving recommendation quality. Additionally, they propose higher-order item-based models that consider combinations of items (itemsets) up to a certain size to enhance recommendation accuracy.

Experimental evaluations on nine real datasets and 36 synthetic datasets show that the proposed item-based algorithms are up to two orders of magnitude faster than traditional user-neighborhood based recommender systems while achieving comparable or better quality in recommendation performance.
The report concludes with a discussion of the effectiveness of similarity normalization and row normalization, and of the sensitivity of recommendation accuracy to model size.
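
As a rough illustration of the two steps summarized above, the following Python sketch builds a first-order item-based model from binary purchase baskets and uses it to rank candidate items. It is not the authors' implementation. The function names, the toy data, the exact form of the frequency adjustment (dividing the co-occurrence count by `Freq(given) * Freq(candidate) ** alpha`), and the choice to normalize each item's retained similarities so they sum to 1 are assumptions made for this example. Row normalization and the higher-order (itemset) models are not covered.

```python
from collections import defaultdict
from math import sqrt


def item_users(baskets):
    """Invert user baskets into a map: item -> set of user indices."""
    users = defaultdict(set)
    for user, items in enumerate(baskets):
        for item in items:
            users[item].add(user)
    return users


def cosine_sim(users, given, candidate):
    """Cosine of the two binary item vectors in user space."""
    co = len(users[given] & users[candidate])
    return co / sqrt(len(users[given]) * len(users[candidate]))


def cond_prob_sim(users, given, candidate, alpha=0.5):
    """Estimate of P(candidate | given), damped by the candidate's frequency.

    alpha is a hypothetical tuning parameter: alpha=0 gives the plain
    conditional probability, larger values penalize popular candidates.
    """
    co = len(users[given] & users[candidate])
    return co / (len(users[given]) * len(users[candidate]) ** alpha)


def build_model(baskets, sim=cosine_sim, k=2):
    """For each item, keep its k most similar items with normalized weights."""
    users = item_users(baskets)
    model = {}
    for j in users:
        scores = {i: sim(users, j, i) for i in users if i != j}
        # Sort by descending similarity, breaking ties by item name.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        top = [(i, s) for i, s in ranked[:k] if s > 0]
        total = sum(s for _, s in top)
        # Normalize so each item's retained similarities sum to 1.
        model[j] = {i: s / total for i, s in top} if total > 0 else {}
    return model


def recommend(model, basket, n=2):
    """Sum neighbor weights over the basket and return the top-n new items."""
    scores = defaultdict(float)
    for j in basket:
        for i, s in model.get(j, {}).items():
            if i not in basket:  # never recommend items already in the basket
                scores[i] += s
    return sorted(scores, key=lambda i: (-scores[i], i))[:n]


# Toy data: one set of purchased items per user.
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs", "jam"},
    {"bread", "jam"},
]
model = build_model(baskets, sim=cosine_sim, k=2)
print(recommend(model, {"bread", "milk"}, n=1))
```

Passing `sim=cond_prob_sim` to `build_model` switches the sketch to the conditional probability-based measure. Unlike cosine similarity, that measure is asymmetric, so the neighbors kept for a rare item can differ from those kept for a popular one. Keeping only `k` neighbors per item is what keeps the model small and the recommendation step fast.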

Item-Based Top-N Recommendation Algorithms

January 20, 2003 | Mukund Deshpande and George Karypis