The paper examines the privacy guarantees of the Adaptive Batch Linear Queries (ABLQ) mechanism under different types of batch sampling: shuffling and Poisson subsampling. While shuffling-based Differentially Private Stochastic Gradient Descent (DP-SGD) is more commonly used in practice, it has not been amenable to easy privacy analysis. In contrast, Poisson subsampling-based DP-SGD has a well-understood privacy analysis but is challenging to implement scalably. The study shows that there can be a substantial gap between the privacy analysis when using these two types of batch sampling, advising caution in reporting privacy parameters for DP-SGD.
The authors formally define the privacy loss curve of ABLQ for different batch samplers and demonstrate that:
- Shuffling (D) always provides stronger privacy guarantees than deterministic batching (S).
- The privacy guarantees of ABLQ with deterministic batching and Poisson subsampling (P) are incomparable, with ABLQ(S) potentially providing worse privacy guarantees than ABLQ(P) for large values of ε.
- For sufficiently large ε, ABLQ(S) provides worse privacy guarantees than ABLQ(P).
The paper also provides numerical examples and theoretical proofs to support these findings, highlighting the importance of choosing the appropriate batch sampler for accurate privacy analysis in DP-SGD implementations.The paper examines the privacy guarantees of the Adaptive Batch Linear Queries (ABLQ) mechanism under different types of batch sampling: shuffling and Poisson subsampling. While shuffling-based Differentially Private Stochastic Gradient Descent (DP-SGD) is more commonly used in practice, it has not been amenable to easy privacy analysis. In contrast, Poisson subsampling-based DP-SGD has a well-understood privacy analysis but is challenging to implement scalably. The study shows that there can be a substantial gap between the privacy analysis when using these two types of batch sampling, advising caution in reporting privacy parameters for DP-SGD.
The authors formally define the privacy loss curve of ABLQ for different batch samplers and demonstrate that:
- Shuffling (D) always provides stronger privacy guarantees than deterministic batching (S).
- The privacy guarantees of ABLQ with deterministic batching and Poisson subsampling (P) are incomparable, with ABLQ(S) potentially providing worse privacy guarantees than ABLQ(P) for large values of ε.
- For sufficiently large ε, ABLQ(S) provides worse privacy guarantees than ABLQ(P).
The paper also provides numerical examples and theoretical proofs to support these findings, highlighting the importance of choosing the appropriate batch sampler for accurate privacy analysis in DP-SGD implementations.