by Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, Natalie Glance
Abstract:
Given a water distribution network, where should we place sensors to quickly detect contaminants? Or, which blogs should we read to avoid missing important stories? These seemingly different problems share common structure: Outbreak detection can be modeled as selecting nodes (sensor locations, blogs) in a network, in order to detect the spreading of a virus or information as quickly as possible. We present a general methodology for near optimal sensor placement in these and related problems. We demonstrate that many realistic outbreak detection objectives (e.g., detection likelihood, population affected) exhibit the property of "submodularity". We exploit submodularity to develop an efficient algorithm that scales to large problems, achieving near optimal placements, while being 700 times faster than a simple greedy algorithm. We also derive online bounds on the quality of the placements obtained by any algorithm. Our algorithms and bounds also handle cases where nodes (sensor locations, blogs) have different costs. We evaluate our approach on several large real-world problems, including a model of a water distribution network from the EPA, and real blog data. The obtained sensor placements are provably near optimal, providing a constant fraction of the optimal solution. We show that the approach scales, achieving speedups and savings in storage of several orders of magnitude. We also show how the approach leads to deeper insights in both applications, answering multicriteria trade-off, cost-sensitivity and generalization questions.
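To illustrate how submodularity can reduce the work of greedy selection, here is a minimal sketch of greedy placement with lazy evaluation of marginal gains. It is a standard technique for submodular objectives and is not claimed to be the paper's exact algorithm. The objective (number of outbreak scenarios detected), the unit-cost budget, the function name `lazy_greedy`, and the toy `detects` data are all assumptions made for illustration.

```python
import heapq


def lazy_greedy(detects, budget):
    """Pick up to `budget` nodes to maximize the number of detected scenarios.

    detects: dict mapping node -> set of scenario ids a sensor at that node detects
             (hypothetical input format, unit cost per node assumed).
    Returns (chosen nodes in pick order, covered scenarios, number of gain evaluations).
    """
    covered = set()
    chosen = []
    # Max-heap via negated gains. The third field records how many nodes had
    # been chosen when the gain was computed, so stale entries can be spotted.
    heap = [(-len(scenarios), node, 0) for node, scenarios in detects.items()]
    heapq.heapify(heap)
    evaluations = len(heap)

    while heap and len(chosen) < budget:
        neg_gain, node, stamp = heapq.heappop(heap)
        if stamp == len(chosen):
            # The gain is up to date. Submodularity means every other entry's
            # true gain is at most its stored (possibly stale) gain, so this
            # node is the best choice without re-evaluating the rest.
            if neg_gain == 0:
                break  # nothing left to gain
            chosen.append(node)
            covered |= detects[node]
        else:
            # Stale entry: recompute its marginal gain and push it back.
            gain = len(detects[node] - covered)
            evaluations += 1
            heapq.heappush(heap, (-gain, node, len(chosen)))
    return chosen, covered, evaluations


# Toy data: four candidate sensor locations and seven outbreak scenarios.
detects = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}
chosen, covered, evaluations = lazy_greedy(detects, budget=2)
print(chosen, sorted(covered), evaluations)
```

On this toy input the sketch performs 5 gain evaluations, whereas a greedy loop that re-evaluates every remaining node in each round would perform 7. The saving comes from the fact that marginal gains of a submodular function can only shrink as the selected set grows, so an old gain is a valid upper bound.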
Reference:
Cost-effective Outbreak Detection in Networks. J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, N. Glance. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2007. Winner of the Best Paper Award.
Bibtex Entry:
@inproceedings{leskovec07cost,
	author = {Jure Leskovec and Andreas Krause and Carlos Guestrin and Christos Faloutsos and Jeanne VanBriesen and Natalie Glance},
	booktitle = {ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD)},
	location = {San Jose, California},
	month = {August},
	pages = {420--429},
	title = {Cost-effective Outbreak Detection in Networks},
	year = {2007}}