An Empirical Study on the Intrinsic Privacy of SGD
Theory and Practice of Differential Privacy (CCS Workshop)
We take the first step towards understanding whether the intrinsic randomness of stochastic gradient descent (SGD) can be leveraged for privacy, for any given dataset and model. In doing so, we hope to mitigate the trade-off between privacy and performance for models trained with differential-privacy (DP) guarantees. Our main contribution is a large-scale empirical analysis of SGD on convex and non-convex objectives across four datasets. We evaluate the inherent variability of SGD and calculate the intrinsic, data-dependent ϵi(D) values that this noise provides. We show that the variability in model parameters due to random sampling almost always exceeds that due to changes in the data. We also show that the existing theoretical bound on the sensitivity of SGD with convex objectives is not tight. For logistic regression, we observe that SGD provides intrinsic ϵi(D) values between 3.95 and 23.10 across the four datasets, dropping to between 1.25 and 4.22 under the tighter empirical sensitivity bound. For neural networks, we report high ϵi(D) values (>40) owing to their larger parameter count. Next, we propose a method to augment the intrinsic noise of SGD to achieve a desired target ϵ. Our augmented SGD produces models that outperform existing approaches with the same privacy target, closing the gap to noiseless utility by between 0.03% and 36.31% for logistic regression. We further explore the role of the number of SGD steps and demonstrate that our estimates are stable. Our experiments provide concrete evidence that changing the seed in SGD has a far greater impact on the model's weights than excluding any given training example. By accounting for this intrinsic randomness, subject to necessary assumptions, we can achieve a consistent and statistically significant improvement in utility without sacrificing further privacy.
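To make the quantities in the abstract concrete, the sketch below illustrates one plausible way to (i) estimate the empirical, data-dependent sensitivity of SGD by retraining with a single example removed, (ii) measure the intrinsic seed-to-seed variability of the learned weights, and (iii) compute how much extra Gaussian noise would be needed on top of that intrinsic noise to meet a target (ϵ, δ). This is a hypothetical illustration, not the authors' code: the logistic-regression trainer, the synthetic data, the seed set, and the use of the classic Gaussian-mechanism calibration σ = Δ√(2 ln(1.25/δ))/ϵ are all assumptions made for the example.

```python
# Hypothetical sketch (not the authors' released code): estimate the empirical
# sensitivity of SGD to removing one training example, the intrinsic
# seed-to-seed variability of the learned weights, and the extra Gaussian
# noise needed on top of the intrinsic noise to reach a target (epsilon, delta).
import numpy as np


def train_sgd(X, y, seed, lr=0.1, epochs=5):
    """Plain SGD for logistic regression; returns the final weight vector."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            p = 1.0 / (1.0 + np.exp(-X[i] @ w))
            w -= lr * (p - y[i]) * X[i]
    return w


def empirical_sensitivity(X, y, seed, n_removals=20):
    """Max L2 change in the weights when a single training example is removed."""
    w_full = train_sgd(X, y, seed)
    dists = []
    for j in range(min(n_removals, len(y))):
        mask = np.arange(len(y)) != j
        dists.append(np.linalg.norm(w_full - train_sgd(X[mask], y[mask], seed)))
    return max(dists)


def intrinsic_noise_scale(X, y, seeds):
    """Average per-parameter std of the weights across random seeds."""
    W = np.stack([train_sgd(X, y, s) for s in seeds])
    return W.std(axis=0).mean()


def gaussian_sigma(sensitivity, epsilon, delta):
    """Classic Gaussian-mechanism noise scale for (epsilon, delta)-DP (epsilon <= 1)."""
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon


# Toy usage on synthetic data (dataset and hyperparameters are assumptions).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(float)

delta_hat = empirical_sensitivity(X, y, seed=42)               # data-dependent sensitivity estimate
sigma_intrinsic = intrinsic_noise_scale(X, y, seeds=range(5))  # noise already present in SGD
sigma_target = gaussian_sigma(delta_hat, epsilon=1.0, delta=1e-5)

# "Augment" the intrinsic noise: add only the shortfall between the target and
# intrinsic scales (variances add for independent Gaussian approximations).
sigma_extra = np.sqrt(max(0.0, sigma_target**2 - sigma_intrinsic**2))
print(f"sensitivity ~ {delta_hat:.4f}, intrinsic sigma ~ {sigma_intrinsic:.4f}, "
      f"extra sigma needed ~ {sigma_extra:.4f}")
```

The point of the final step is the "augmentation" idea from the abstract: rather than adding the full noise required by the target ϵ, one would add only the gap between the required scale and the noise SGD already exhibits, which is what drives the reported utility gains at a fixed privacy target. The exact accounting in the paper may differ; this sketch only conveys the shape of the computation.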