1999 | Jon M. Kleinberg¹, Ravi Kumar², Prabhakar Raghavan², Sridhar Rajagopalan², and Andrew S. Tomkins²
The Web can be viewed as a directed graph, with web pages as nodes and hyperlinks as edges. This graph has hundreds of millions of nodes and over a billion links, growing exponentially over time. Studying this graph is important for mathematical, sociological, and commercial reasons. This paper presents two algorithms for analyzing the Web graph: Kleinberg's HITS method and the enumeration of bipartite cliques. These algorithms help with web search and community discovery. The paper also reports measurements of the Web graph, such as the in- and out-degrees of nodes following Zipfian distributions, suggesting that traditional random graph models like G_n,p are inadequate for modeling the Web graph. The authors propose a new family of random graph models that incorporate a copying process, where links are added by copying from existing nodes. These models better explain the observed statistics and suggest that the mathematical analysis of such models is more complex than traditional ones. The paper also discusses related work, including the use of Web graph analysis to improve web search and the observation that Zipfian distributions characterize citation frequencies in both web and academic literature, as described by Lotka's law.The Web can be viewed as a directed graph, with web pages as nodes and hyperlinks as edges. This graph has hundreds of millions of nodes and over a billion links, growing exponentially over time. Studying this graph is important for mathematical, sociological, and commercial reasons. This paper presents two algorithms for analyzing the Web graph: Kleinberg's HITS method and the enumeration of bipartite cliques. These algorithms help with web search and community discovery. The paper also reports measurements of the Web graph, such as the in- and out-degrees of nodes following Zipfian distributions, suggesting that traditional random graph models like G_n,p are inadequate for modeling the Web graph. The authors propose a new family of random graph models that incorporate a copying process, where links are added by copying from existing nodes. These models better explain the observed statistics and suggest that the mathematical analysis of such models is more complex than traditional ones. The paper also discusses related work, including the use of Web graph analysis to improve web search and the observation that Zipfian distributions characterize citation frequencies in both web and academic literature, as described by Lotka's law.