2024 | Priyanka Nanayakkara, Hyeok Kim, Yifan Wu, Ali Sarvghad, Narges Mahyar, Gerome Miklau, Jessica Hullman
The MEASURE-OBSERVE-REMEASURE paradigm is an interactive approach for differentially private exploratory analysis. It allows analysts to iteratively measure, observe, and remeasure queries to efficiently allocate the privacy loss budget (ε). Analysts start by measuring a query with a limited ε, observe the estimate and its error, and then remeasure with more ε if higher accuracy is needed. This process helps analysts spend ε efficiently without knowing all queries in advance.
The paradigm is implemented in an interactive visualization interface that lets analysts spend increasing amounts of ε under a total budget. A user study compared the utility of participants' ε allocations, and of the findings they reported from sensitive data, to those expected of a rational agent. Participants used the workflow successfully, adopting ε allocation strategies that achieved over half of the available utility. Their performance loss relative to a rational agent was driven more by difficulty accessing and reporting information than by how they allocated ε.
Differential privacy (DP) ensures privacy-preserving analysis by adding noise to query results. The privacy loss budget (ε) controls the amount of noise added, with smaller ε values providing stronger privacy but lower accuracy. Analysts must carefully allocate ε across queries to maximize accuracy while respecting privacy constraints.
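To make the relationship between ε and noise concrete, the sketch below applies the Laplace mechanism to a counting query. This is a standard textbook mechanism used here only for illustration; it is not the mechanism the interface uses, and the function names and the sensitivity of 1 are assumptions.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    # Hypothetical helper: answers a counting query under epsilon-DP.
    # Sensitivity 1 assumes one person changes the count by at most 1.
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    return true_count + laplace_noise(scale, rng)


def laplace_std(epsilon, sensitivity=1.0):
    # Standard deviation of Laplace noise with scale sensitivity / epsilon.
    return math.sqrt(2.0) * sensitivity / epsilon


if __name__ == "__main__":
    rng = random.Random(0)
    for eps in (0.1, 0.5, 1.0):
        print(eps, round(noisy_count(1000, eps, rng), 1), round(laplace_std(eps), 2))
```

Halving ε doubles the standard deviation of the noise, which is the trade-off an analyst weighs when deciding how much budget a query deserves.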
The MEASURE-OBSERVE-REMEASURE paradigm addresses the challenge of efficiently spending ε in exploratory data analysis (EDA), where analysts determine subsequent queries based on earlier results. This iterative process can lead to wasted ε if initial queries are not accurate enough. The paradigm allows analysts to remeasure queries as needed, improving estimates at each step.
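The loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes Laplace noise on a counting query, basic sequential composition (spent ε simply adds up), and a fixed error threshold `target_std` standing in for the analyst's judgment about whether an estimate is accurate enough. All names are hypothetical.

```python
import math
import random


class PrivacyBudget:
    """Tracks cumulative epsilon under basic sequential composition."""

    def __init__(self, total):
        self.total = total
        self.spent = 0.0

    def remaining(self):
        return self.total - self.spent

    def charge(self, epsilon):
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining() + 1e-12:
            raise ValueError("not enough budget left")
        self.spent += epsilon


def measure(true_count, epsilon, budget, rng):
    # Measure: spend epsilon and return a noisy estimate with its error.
    budget.charge(epsilon)
    # expovariate takes a rate; rate = epsilon gives Laplace scale 1 / epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    std = math.sqrt(2.0) / epsilon
    return true_count + noise, std


def measure_observe_remeasure(true_count, budget, rng,
                              start_eps=0.05, target_std=5.0):
    eps = start_eps
    history = []
    while eps <= budget.remaining() + 1e-12:
        estimate, std = measure(true_count, eps, budget, rng)
        history.append((eps, estimate, std))
        # Observe: stop once the reported error is acceptable.
        if std <= target_std:
            break
        # Remeasure: try again with twice as much epsilon.
        eps *= 2
    return history


if __name__ == "__main__":
    budget = PrivacyBudget(1.0)
    for eps, est, std in measure_observe_remeasure(1000, budget, random.Random(0)):
        print(f"eps={eps:.2f} estimate={est:.1f} std={std:.2f}")
    print("spent:", round(budget.spent, 2))
```

Starting small and doubling means the analyst never commits a large share of the budget to a query before seeing how noisy a cheaper answer is.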
The workflow uses the High-Dimensional Matrix Mechanism (HDMM) to answer queries under DP. HDMM makes efficient use of ε by combining noisy estimates from multiple queries. The interface visualizes query estimates with error bars, enabling analysts to assess accuracy and decide whether to remeasure.
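One common way to combine independent noisy estimates of the same quantity is inverse-variance weighting, sketched below. This is a generic statistical technique shown under the assumption of independent, unbiased measurements; it is not a description of HDMM's internals, and the function names are hypothetical.

```python
def laplace_variance(epsilon, sensitivity=1.0):
    # Variance of Laplace noise with scale sensitivity / epsilon.
    return 2.0 * (sensitivity / epsilon) ** 2


def combine_estimates(estimates, variances):
    # Inverse-variance weighting: more precise measurements get more weight.
    if not estimates or len(estimates) != len(variances):
        raise ValueError("need equally many estimates and variances")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    combined = sum(w * x for w, x in zip(weights, estimates)) / total
    # The combined variance is never larger than the smallest input variance.
    return combined, 1.0 / total


if __name__ == "__main__":
    # Two measurements of the same count, taken with epsilon 0.1 and 0.4.
    variances = [laplace_variance(0.1), laplace_variance(0.4)]
    print(combine_estimates([1012.0, 998.0], variances))
```

Because the combined variance is at most the smallest input variance, ε spent on an earlier, noisier measurement still contributes to the final estimate rather than being discarded.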
A user study evaluated the effectiveness of the paradigm, comparing participants' performance to rational agent benchmarks. Participants demonstrated the ability to allocate ε effectively, with their performance loss primarily due to limited access to information rather than suboptimal ε allocation. The study highlights the potential of the MEASURE-OBSERVE-REMEASURE paradigm to support analysts in efficiently spending ε while conducting exploratory analysis under differential privacy.