2011 | MANUEL GOMEZ-RODRIGUEZ, JURE LESKOVEC, ANDREAS KRAUSE
The paper addresses the challenge of inferring networks of diffusion and influence, where the underlying network is often unobserved. The authors propose a method to trace paths of diffusion and influence through networks, aiming to reconstruct the network over which contagions propagate. Given the times when nodes adopt information or become infected, the goal is to identify the optimal network that best explains these infection times. Since the optimization problem is NP-hard, an efficient approximation algorithm is developed that scales to large datasets and finds near-optimal networks.
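The inference goal described above can be written compactly as an optimization problem. The notation is an assumption introduced here for illustration, not taken from the summary: $C$ is the set of observed cascades (infection times per contagion), $G$ is a candidate directed network, $P(C \mid G)$ is the likelihood of the cascades under $G$, and $k$ is an assumed budget on the number of edges (without some such constraint, a complete graph would trivially explain every cascade).

```latex
\hat{G} \;=\; \operatorname*{arg\,max}_{|G| \le k} \; P(C \mid G)
```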
The effectiveness of the approach is demonstrated through a study of information diffusion in a dataset of 170 million blogs and news articles over a one-year period. The results show that the diffusion network of news for the top 1,000 media sites and blogs tends to have a core-periphery structure, with a small set of core media sites diffusing information to the rest of the Web. These core sites have stable circles of influence, and general news media sites act as connectors between them.
The proposed algorithm, NETINF, solves the network inference problem efficiently by considering only the most likely propagation trees for each cascade. It is shown to outperform a baseline heuristic by an order of magnitude and to correctly discover more than 90% of the edges. The results highlight the importance of understanding how information or viruses propagate over networks, providing insights into the roles and influence of nodes in the diffusion process.
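The idea of scoring a network by the most likely propagation tree of each cascade can be illustrated with a small sketch. This is not the authors' implementation: the exponential transmission model, the `EPS` fallback for unexplained infections, the greedy edge-selection loop, and all function names are assumptions chosen for illustration, and the brute-force rescoring makes no attempt to reproduce the efficiency of NETINF.

```python
import math
from itertools import permutations

EPS = 1e-6  # assumed probability of an infection no chosen edge explains


def edge_weight(dt, alpha=1.0):
    """Assumed exponential transmission model for a delay dt > 0."""
    return alpha * math.exp(-alpha * dt)


def cascade_score(cascade, edges):
    """Log-likelihood of the most likely propagation tree for one cascade.

    `cascade` maps node -> infection time. Each node takes its best
    earlier-infected parent among `edges`, or falls back to EPS.
    """
    score = 0.0
    for child, t_child in cascade.items():
        best = EPS
        for parent, t_parent in cascade.items():
            if t_parent < t_child and (parent, child) in edges:
                best = max(best, edge_weight(t_child - t_parent))
        score += math.log(best)
    return score


def total_score(cascades, edges):
    """Sum of per-cascade tree scores for a candidate edge set."""
    return sum(cascade_score(c, edges) for c in cascades)


def greedy_infer(cascades, k):
    """Greedily add up to k directed edges with the largest score gain."""
    nodes = sorted({n for c in cascades for n in c})
    candidates = set(permutations(nodes, 2))
    chosen = set()
    current = total_score(cascades, chosen)
    for _ in range(k):
        best_edge, best_gain = None, 0.0
        for edge in sorted(candidates - chosen):
            gain = total_score(cascades, chosen | {edge}) - current
            if gain > best_gain:
                best_edge, best_gain = edge, gain
        if best_edge is None:  # no remaining edge improves the score
            break
        chosen.add(best_edge)
        current += best_gain
    return chosen


# Toy cascades generated along a hypothetical chain a -> b -> c.
cascades = [
    {"a": 0.0, "b": 1.0, "c": 2.0},
    {"a": 0.0, "b": 1.0},
    {"b": 0.0, "c": 1.0},
]
inferred = greedy_infer(cascades, k=2)
```

On these toy cascades the two selected edges are the chain edges, since each explains two observed infections with a short delay, while the shortcut from `a` to `c` explains only one infection with a longer delay.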