Nearly Tight Black-Box Auditing of Differentially Private Machine Learning


2 Nov 2024 | Meenatchi Sundaram Muthu Selva Annamalai, Emiliano De Cristofaro
This paper presents a novel auditing procedure for Differentially Private Stochastic Gradient Descent (DP-SGD) that achieves substantially tighter black-box audits than prior work. The main idea is to audit from worst-case initial model parameters, which is valid because DP-SGD's privacy analysis is agnostic to the choice of initial parameters. For models trained on MNIST and CIFAR-10 at theoretical ε=10.0, the procedure yields empirical estimates of ε_emp=7.21 and 6.95 on 1,000-record samples, and ε_emp=6.48 and 4.96 on the full datasets. These results are significantly tighter than previous black-box audits, which were relatively tight only under stronger white-box threat models. The auditing procedure provides insight into how DP-SGD's privacy analysis could be improved and can detect bugs and DP violations in real-world implementations. The source code for reproducing the experiments is available.

The paper considers a black-box threat model in which the adversary only sees the final model parameters and cannot observe intermediate training steps or insert canary gradients. The auditing procedure runs membership inference attacks to estimate empirical privacy leakage. The results show that the gap between worst-case empirical and theoretical privacy leakage in the black-box threat model is much smaller than previously thought: for models trained on MNIST and CIFAR-10 at theoretical ε=10.0, the audits achieve ε_emp=6.48 and 4.96, respectively, compared to ε_emp=3.41 and 0.69 when using average-case initial parameters. The audits are also tighter on smaller datasets of 1,000 samples, reaching ε_emp=7.21 and 6.95 for MNIST and CIFAR-10.

The paper also presents tight audits for models with only the last layer fine-tuned using DP-SGD. Specifically, it audits a 28-layer Wide-ResNet pre-trained on ImageNet-32 with the last layer privately fine-tuned on CIFAR-10. At ε=10.0, the empirical privacy leakage estimate is ε_emp=8.30, compared to ε_emp=7.69 when using average-case initial parameters. Overall, the ability to perform rigorous audits of differentially private machine learning sheds light on the tightness of theoretical guarantees in different settings and provides a critical diagnostic tool for verifying the correctness of implementations and libraries.
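The membership inference step described above ultimately reduces to converting an attack's error rates into an empirical ε estimate via the (ε, δ)-DP hypothesis-testing bound. The sketch below illustrates that conversion only; the function name is hypothetical, and it omits the confidence intervals (e.g., Clopper-Pearson bounds over many training runs) that a rigorous audit like this paper's would use.

```python
import math


def empirical_epsilon(fpr: float, fnr: float, delta: float = 1e-5) -> float:
    """Point estimate of empirical epsilon from a membership inference
    attack's false-positive rate (fpr) and false-negative rate (fnr).

    Uses the standard (eps, delta)-DP hypothesis-testing constraint:
        fpr + e^eps * fnr >= 1 - delta   (and the symmetric version),
    rearranged to solve for the largest eps consistent with the
    observed error rates. Illustrative sketch, not the paper's code.
    """
    candidates = []
    if fnr > 0 and (1 - delta - fpr) > 0:
        candidates.append(math.log((1 - delta - fpr) / fnr))
    if fpr > 0 and (1 - delta - fnr) > 0:
        candidates.append(math.log((1 - delta - fnr) / fpr))
    if not candidates:
        # An attack with zero error in both directions implies
        # unbounded leakage (no finite eps is consistent with it).
        return float("inf")
    # A worse-than-random attack would give a negative bound; clamp to 0.
    return max(0.0, max(candidates))
```

For instance, a random-guessing attack (fpr = fnr = 0.5) yields ε_emp ≈ 0, while lower error rates push the estimate up; the point estimate is only meaningful when the rates come from enough trials to be statistically reliable.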