This paper provides a comprehensive survey of efficient algorithms for computing Personalized PageRank (PPR), a measure of node proximity in graphs. PPR is defined as the probability that an α-discounted random walk starting from a source node $ s $ terminates at a target node $ t $, reflecting the mutual importance between $ s $ and $ t $. It is a generalization of Google's PageRank and has found applications in network analysis, graph mining, and graph machine learning. Despite extensive research, efficient computation of PPR remains challenging, and there is a lack of systematic summaries of existing algorithms.
The paper categorizes PPR algorithms based on the types of queries they address, such as single-source PPR (SSPPR), single-target PPR (STPPR), single-pair PPR (SPPPR), and top-$ k $ PPR. It reviews various techniques for PPR computation, including the Monte Carlo method, Power Iteration (PI), Forward Push, Reverse Power Iteration, and Backward Push. These methods are analyzed in terms of their theoretical guarantees, computational complexity, and suitability for different query types.
The paper also discusses algorithms for computing PPR on dynamic graphs and in parallel or distributed environments. It highlights the importance of approximation techniques, as exact computation can be prohibitively expensive for large graphs. Error measures such as $ \ell_1 $-error, $ \ell_2 $-error, absolute error, and relative error are discussed, along with their implications for algorithm design.
The paper concludes with a comparison of the basic techniques, emphasizing the trade-offs between accuracy, efficiency, and computational complexity. It also outlines future research directions, including the development of more scalable and efficient algorithms for PPR computation. Overall, the paper serves as a valuable resource for researchers and practitioners working on graph algorithms and machine learning.This paper provides a comprehensive survey of efficient algorithms for computing Personalized PageRank (PPR), a measure of node proximity in graphs. PPR is defined as the probability that an α-discounted random walk starting from a source node $ s $ terminates at a target node $ t $, reflecting the mutual importance between $ s $ and $ t $. It is a generalization of Google's PageRank and has found applications in network analysis, graph mining, and graph machine learning. Despite extensive research, efficient computation of PPR remains challenging, and there is a lack of systematic summaries of existing algorithms.
The paper categorizes PPR algorithms based on the types of queries they address, such as single-source PPR (SSPPR), single-target PPR (STPPR), single-pair PPR (SPPPR), and top-$ k $ PPR. It reviews various techniques for PPR computation, including the Monte Carlo method, Power Iteration (PI), Forward Push, Reverse Power Iteration, and Backward Push. These methods are analyzed in terms of their theoretical guarantees, computational complexity, and suitability for different query types.
The paper also discusses algorithms for computing PPR on dynamic graphs and in parallel or distributed environments. It highlights the importance of approximation techniques, as exact computation can be prohibitively expensive for large graphs. Error measures such as $ \ell_1 $-error, $ \ell_2 $-error, absolute error, and relative error are discussed, along with their implications for algorithm design.
The paper concludes with a comparison of the basic techniques, emphasizing the trade-offs between accuracy, efficiency, and computational complexity. It also outlines future research directions, including the development of more scalable and efficient algorithms for PPR computation. Overall, the paper serves as a valuable resource for researchers and practitioners working on graph algorithms and machine learning.