June 2007 | Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, Natalie Glance
This paper addresses the problem of detecting outbreaks in networks, such as placing sensors in a water distribution network to detect contaminants or selecting blogs to monitor information cascades. The authors propose a general methodology for near-optimal sensor placement, demonstrating that many realistic objectives (e.g., detection likelihood, population affected) exhibit submodularity. They develop an efficient algorithm, CELF, which scales to large problems and achieves near-optimal placements, up to 700 times faster than a simple greedy algorithm. The algorithm also handles cases with different node costs and provides online bounds on the quality of placements. Extensive evaluations on real-world problems, including a water distribution network and blog data, show that the approach is effective and scalable, providing insights into multicriteria trade-offs, cost-sensitivity, and generalization.This paper addresses the problem of detecting outbreaks in networks, such as placing sensors in a water distribution network to detect contaminants or selecting blogs to monitor information cascades. The authors propose a general methodology for near-optimal sensor placement, demonstrating that many realistic objectives (e.g., detection likelihood, population affected) exhibit submodularity. They develop an efficient algorithm, CELF, which scales to large problems and achieves near-optimal placements, up to 700 times faster than a simple greedy algorithm. The algorithm also handles cases with different node costs and provides online bounds on the quality of placements. Extensive evaluations on real-world problems, including a water distribution network and blog data, show that the approach is effective and scalable, providing insights into multicriteria trade-offs, cost-sensitivity, and generalization.