Revised June 22, 2005 | Kristen LeFevre, David J. DeWitt, Raghu Ramakrishnan
This paper introduces a new multidimensional model for k-anonymity, a mechanism used to protect privacy in microdata publishing. The model provides more flexibility compared to previous single-dimensional approaches, often leading to higher-quality anonymizations. The authors prove that optimal multidimensional anonymization is NP-hard but introduce a scalable greedy algorithm that produces anonymizations that are a constant-factor approximation of the optimal solution. Experimental results show that this greedy algorithm often produces more desirable anonymizations than optimal exhaustive-search algorithms for single-dimensional models. The paper also discusses the quality of anonymizations using general-purpose metrics and a workload-driven approach, and compares the performance of the greedy algorithm with optimal algorithms for single-dimensional models. The results indicate that the greedy algorithm frequently produces better quality results.This paper introduces a new multidimensional model for k-anonymity, a mechanism used to protect privacy in microdata publishing. The model provides more flexibility compared to previous single-dimensional approaches, often leading to higher-quality anonymizations. The authors prove that optimal multidimensional anonymization is NP-hard but introduce a scalable greedy algorithm that produces anonymizations that are a constant-factor approximation of the optimal solution. Experimental results show that this greedy algorithm often produces more desirable anonymizations than optimal exhaustive-search algorithms for single-dimensional models. The paper also discusses the quality of anonymizations using general-purpose metrics and a workload-driven approach, and compares the performance of the greedy algorithm with optimal algorithms for single-dimensional models. The results indicate that the greedy algorithm frequently produces better quality results.