Search | arXiv e-print repository

Range Membership Inference Attacks

Abstract: Machine learning models can leak private information about their training data, but the standard methods to measure this risk, based on membership inference attacks (MIAs), have a major limitation. They only check if a given data point \textit{exactly} matches a training point, neglecting the potential of similar or partially overlapping data revealing the same private information. To address this… ▽ More Machine learning models can leak private information about their training data, but the standard methods to measure this risk, based on membership inference attacks (MIAs), have a major limitation. They only check if a given data point \textit{exactly} matches a training point, neglecting the potential of similar or partially overlapping data revealing the same private information. To address this issue, we introduce the class of range membership inference attacks (RaMIAs), testing if the model was trained on any data in a specified range (defined based on the semantics of privacy). We formulate the RaMIAs game and design a principled statistical test for its complex hypotheses. We show that RaMIAs can capture privacy loss more accurately and comprehensively than MIAs on various types of data, such as tabular, image, and language. RaMIA paves the way for a more comprehensive and meaningful privacy auditing of machine learning algorithms. △ Less

Submitted 9 August, 2024; originally announced August 2024.

arXiv:2407.14206 [pdf, ps, other]

Watermark Smoothing Attacks against Language Models

Authors: Hongyan Chang, Hamed Hassani, Reza Shokri

Abstract: Watermarking is a technique used to embed a hidden signal in the probability distribution of text generated by large language models (LLMs), enabling attribution of the text to the originating model. We introduce smoothing attacks and show that existing watermarking methods are not robust against minor modifications of text. An adversary can use weaker language models to smooth out the distributio… ▽ More Watermarking is a technique used to embed a hidden signal in the probability distribution of text generated by large language models (LLMs), enabling attribution of the text to the originating model. We introduce smoothing attacks and show that existing watermarking methods are not robust against minor modifications of text. An adversary can use weaker language models to smooth out the distribution perturbations caused by watermarks without significantly compromising the quality of the generated text. The modified text resulting from the smoothing attack remains close to the distribution of text that the original model (without watermark) would have produced. Our attack reveals a fundamental limitation of a wide range of watermarking techniques. △ Less

Submitted 19 July, 2024; originally announced July 2024.

arXiv:2405.19471 [pdf, other]

The Data Minimization Principle in Machine Learning

Authors: Prakhar Ganesh, Cuong Tran, Reza Shokri, Ferdinando Fioretto

Abstract: The principle of data minimization aims to reduce the amount of data collected, processed or retained to minimize the potential for misuse, unauthorized access, or data breaches. Rooted in privacy-by-design principles, data minimization has been endorsed by various global data protection regulations. However, its practical implementation remains a challenge due to the lack of a rigorous formulatio… ▽ More The principle of data minimization aims to reduce the amount of data collected, processed or retained to minimize the potential for misuse, unauthorized access, or data breaches. Rooted in privacy-by-design principles, data minimization has been endorsed by various global data protection regulations. However, its practical implementation remains a challenge due to the lack of a rigorous formulation. This paper addresses this gap and introduces an optimization framework for data minimization based on its legal definitions. It then adapts several optimization algorithms to perform data minimization and conducts a comprehensive evaluation in terms of their compliance with minimization objectives as well as their impact on user privacy. Our analysis underscores the mismatch between the privacy expectations of data minimization and the actual privacy benefits, emphasizing the need for approaches that account for multiple facets of real-world privacy risks. △ Less

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2312.03262 [pdf, other]

Low-Cost High-Power Membership Inference Attacks

Authors: Sajjad Zarifzadeh, Philippe Liu, Reza Shokri

Abstract: Membership inference attacks aim to detect if a particular data point was used in training a model. We design a novel statistical test to perform robust membership inference attacks (RMIA) with low computational overhead. We achieve this by a fine-grained modeling of the null hypothesis in our likelihood ratio tests, and effectively leveraging both reference models and reference population data sa… ▽ More Membership inference attacks aim to detect if a particular data point was used in training a model. We design a novel statistical test to perform robust membership inference attacks (RMIA) with low computational overhead. We achieve this by a fine-grained modeling of the null hypothesis in our likelihood ratio tests, and effectively leveraging both reference models and reference population data samples. RMIA has superior test power compared with prior methods, throughout the TPR-FPR curve (even at extremely low FPR, as low as 0). Under computational constraints, where only a limited number of pre-trained reference models (as few as 1) are available, and also when we vary other elements of the attack (e.g., data distribution), our method performs exceptionally well, unlike prior attacks that approach random guessing. RMIA lays the groundwork for practical yet accurate data privacy risk assessment in machine learning. △ Less

Submitted 12 June, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

Comments: ICML 2024

arXiv:2310.20579 [pdf, other]

Initialization Matters: Privacy-Utility Analysis of Overparameterized Neural Networks

Authors: Jiayuan Ye, Zhenyu Zhu, Fanghui Liu, Reza Shokri, Volkan Cevher

Abstract: We analytically investigate how over-parameterization of models in randomized machine learning algorithms impacts the information leakage about their training data. Specifically, we prove a privacy bound for the KL divergence between model distributions on worst-case neighboring datasets, and explore its dependence on the initialization, width, and depth of fully connected neural networks. We find… ▽ More We analytically investigate how over-parameterization of models in randomized machine learning algorithms impacts the information leakage about their training data. Specifically, we prove a privacy bound for the KL divergence between model distributions on worst-case neighboring datasets, and explore its dependence on the initialization, width, and depth of fully connected neural networks. We find that this KL privacy bound is largely determined by the expected squared gradient norm relative to model parameters during training. Notably, for the special setting of linearized network, our analysis indicates that the squared gradient norm (and therefore the escalation of privacy loss) is tied directly to the per-layer variance of the initialization distribution. By using this analysis, we demonstrate that privacy bound improves with increasing depth under certain initializations (LeCun and Xavier), while degrades with increasing depth under other initializations (He and NTK). Our work reveals a complex interplay between privacy and depth that depends on the chosen initialization distribution. We further prove excess empirical risk bounds under a fixed KL privacy budget, and show that the interplay between privacy utility trade-off and depth is similarly affected by the initialization. △ Less

Submitted 31 October, 2023; originally announced October 2023.

arXiv:2310.19973 [pdf, other]

Unified Enhancement of Privacy Bounds for Mixture Mechanisms via $f$-Differential Privacy

Authors: Chendi Wang, Buxin Su, Jiayuan Ye, Reza Shokri, Weijie J. Su

Abstract: Differentially private (DP) machine learning algorithms incur many sources of randomness, such as random initialization, random batch subsampling, and shuffling. However, such randomness is difficult to take into account when proving differential privacy bounds because it induces mixture distributions for the algorithm's output that are difficult to analyze. This paper focuses on improving privacy… ▽ More Differentially private (DP) machine learning algorithms incur many sources of randomness, such as random initialization, random batch subsampling, and shuffling. However, such randomness is difficult to take into account when proving differential privacy bounds because it induces mixture distributions for the algorithm's output that are difficult to analyze. This paper focuses on improving privacy bounds for shuffling models and one-iteration differentially private gradient descent (DP-GD) with random initializations using $f$-DP. We derive a closed-form expression of the trade-off function for shuffling models that outperforms the most up-to-date results based on $(ε,δ)$-DP. Moreover, we investigate the effects of random initialization on the privacy of one-iteration DP-GD. Our numerical computations of the trade-off function indicate that random initialization can enhance the privacy of DP-GD. Our analysis of $f$-DP guarantees for these mixture mechanisms relies on an inequality for trade-off functions introduced in this paper. This inequality implies the joint convexity of $F$-divergences. Finally, we study an $f$-DP analog of the advanced joint convexity of the hockey-stick divergence related to $(ε,δ)$-DP and apply it to analyze the privacy of mixture mechanisms. △ Less

Submitted 1 November, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

arXiv:2310.17884 [pdf, other]

Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory

Authors: Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou, Yulia Tsvetkov, Maarten Sap, Reza Shokri, Yejin Choi

Abstract: The interactive use of large language models (LLMs) in AI assistants (at work, home, etc.) introduces a new set of inference-time privacy risks: LLMs are fed different types of information from multiple sources in their inputs and are expected to reason about what to share in their outputs, for what purpose and with whom, within a given context. In this work, we draw attention to the highly critic… ▽ More The interactive use of large language models (LLMs) in AI assistants (at work, home, etc.) introduces a new set of inference-time privacy risks: LLMs are fed different types of information from multiple sources in their inputs and are expected to reason about what to share in their outputs, for what purpose and with whom, within a given context. In this work, we draw attention to the highly critical yet overlooked notion of contextual privacy by proposing ConfAIde, a benchmark designed to identify critical weaknesses in the privacy reasoning capabilities of instruction-tuned LLMs. Our experiments show that even the most capable models such as GPT-4 and ChatGPT reveal private information in contexts that humans would not, 39% and 57% of the time, respectively. This leakage persists even when we employ privacy-inducing prompts or chain-of-thought reasoning. Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind. △ Less

Submitted 28 June, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

Comments: 2024 ICLR Spotlight. The dataset and code can be found at https://confaide.github.io

arXiv:2309.17310 [pdf, other]

Leave-one-out Distinguishability in Machine Learning

Authors: Jiayuan Ye, Anastasia Borovykh, Soufiane Hayou, Reza Shokri

Abstract: We introduce an analytical framework to quantify the changes in a machine learning algorithm's output distribution following the inclusion of a few data points in its training set, a notion we define as leave-one-out distinguishability (LOOD). This is key to measuring data **memorization** and information **leakage** as well as the **influence** of training data points in machine learning. We illu… ▽ More We introduce an analytical framework to quantify the changes in a machine learning algorithm's output distribution following the inclusion of a few data points in its training set, a notion we define as leave-one-out distinguishability (LOOD). This is key to measuring data **memorization** and information **leakage** as well as the **influence** of training data points in machine learning. We illustrate how our method broadens and refines existing empirical measures of memorization and privacy risks associated with training data. We use Gaussian processes to model the randomness of machine learning algorithms, and validate LOOD with extensive empirical analysis of leakage using membership inference attacks. Our analytical framework enables us to investigate the causes of leakage and where the leakage is high. For example, we analyze the influence of activation functions, on data memorization. Additionally, our method allows us to identify queries that disclose the most information about the training data in the leave-one-out setting. We illustrate how optimal queries can be used for accurate **reconstruction** of training data. △ Less

Submitted 17 April, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

Comments: ICLR 2024

arXiv:2309.05505 [pdf, other]

Share Your Representation Only: Guaranteed Improvement of the Privacy-Utility Tradeoff in Federated Learning

Authors: Zebang Shen, Jiayuan Ye, Anmin Kang, Hamed Hassani, Reza Shokri

Abstract: Repeated parameter sharing in federated learning causes significant information leakage about private data, thus defeating its main purpose: data privacy. Mitigating the risk of this information leakage, using state of the art differentially private algorithms, also does not come for free. Randomized mechanisms can prevent convergence of models on learning even the useful representation functions,… ▽ More Repeated parameter sharing in federated learning causes significant information leakage about private data, thus defeating its main purpose: data privacy. Mitigating the risk of this information leakage, using state of the art differentially private algorithms, also does not come for free. Randomized mechanisms can prevent convergence of models on learning even the useful representation functions, especially if there is more disagreement between local models on the classification functions (due to data heterogeneity). In this paper, we consider a representation federated learning objective that encourages various parties to collaboratively refine the consensus part of the model, with differential privacy guarantees, while separately allowing sufficient freedom for local personalization (without releasing it). We prove that in the linear representation setting, while the objective is non-convex, our proposed new algorithm \DPFEDREP\ converges to a ball centered around the \emph{global optimal} solution at a linear rate, and the radius of the ball is proportional to the reciprocal of the privacy budget. With this novel utility analysis, we improve the SOTA utility-privacy trade-off for this problem by a factor of $\sqrt{d}$, where $d$ is the input dimension. We empirically evaluate our method with the image classification task on CIFAR10, CIFAR100, and EMNIST, and observe a significant performance improvement over the prior work under the same small privacy budget. The code can be found in this link: https://github.com/shenzebang/CENTAUR-Privacy-Federated-Representation-Learning. △ Less

Submitted 11 September, 2023; originally announced September 2023.

Comments: ICLR 2023 revised

arXiv:2309.02160 [pdf, other]

Bias Propagation in Federated Learning

Authors: Hongyan Chang, Reza Shokri

Abstract: We show that participating in federated learning can be detrimental to group fairness. In fact, the bias of a few parties against under-represented groups (identified by sensitive attributes such as gender or race) can propagate through the network to all the parties in the network. We analyze and explain bias propagation in federated learning on naturally partitioned real-world datasets. Our anal… ▽ More We show that participating in federated learning can be detrimental to group fairness. In fact, the bias of a few parties against under-represented groups (identified by sensitive attributes such as gender or race) can propagate through the network to all the parties in the network. We analyze and explain bias propagation in federated learning on naturally partitioned real-world datasets. Our analysis reveals that biased parties unintentionally yet stealthily encode their bias in a small number of model parameters, and throughout the training, they steadily increase the dependence of the global model on sensitive attributes. What is important to highlight is that the experienced bias in federated learning is higher than what parties would otherwise encounter in centralized training with a model trained on the union of all their data. This indicates that the bias is due to the algorithm. Our work calls for auditing group fairness in federated learning and designing learning algorithms that are robust to bias propagation. △ Less

Submitted 5 September, 2023; originally announced September 2023.

Journal ref: The Eleventh International Conference on Learning Representations, 2023

arXiv:2307.04138 [pdf, other]

doi 10.1145/3593013.3594116

On The Impact of Machine Learning Randomness on Group Fairness

Authors: Prakhar Ganesh, Hongyan Chang, Martin Strobel, Reza Shokri

Abstract: Statistical measures for group fairness in machine learning reflect the gap in performance of algorithms across different groups. These measures, however, exhibit a high variance between different training instances, which makes them unreliable for empirical evaluation of fairness. What causes this high variance? We investigate the impact on group fairness of different sources of randomness in tra… ▽ More Statistical measures for group fairness in machine learning reflect the gap in performance of algorithms across different groups. These measures, however, exhibit a high variance between different training instances, which makes them unreliable for empirical evaluation of fairness. What causes this high variance? We investigate the impact on group fairness of different sources of randomness in training neural networks. We show that the variance in group fairness measures is rooted in the high volatility of the learning process on under-represented groups. Further, we recognize the dominant source of randomness as the stochasticity of data order during training. Based on these findings, we show how one can control group-level accuracy (i.e., model fairness), with high efficiency and negligible impact on the model's overall performance, by simply changing the data order for a single epoch. △ Less

Submitted 9 July, 2023; originally announced July 2023.

Comments: 10 pages + Appendix

arXiv:2305.09859 [pdf, other]

Smaller Language Models are Better Black-box Machine-Generated Text Detectors

Authors: Niloofar Mireshghallah, Justus Mattern, Sicun Gao, Reza Shokri, Taylor Berg-Kirkpatrick

Abstract: With the advent of fluent generative language models that can produce convincing utterances very similar to those written by humans, distinguishing whether a piece of text is machine-generated or human-written becomes more challenging and more important, as such models could be used to spread misinformation, fake news, fake reviews and to mimic certain authors and figures. To this end, there have… ▽ More With the advent of fluent generative language models that can produce convincing utterances very similar to those written by humans, distinguishing whether a piece of text is machine-generated or human-written becomes more challenging and more important, as such models could be used to spread misinformation, fake news, fake reviews and to mimic certain authors and figures. To this end, there have been a slew of methods proposed to detect machine-generated text. Most of these methods need access to the logits of the target model or need the ability to sample from the target. One such black-box detection method relies on the observation that generated text is locally optimal under the likelihood function of the generator, while human-written text is not. We find that overall, smaller and partially-trained models are better universal text detectors: they can more precisely detect text generated from both small and larger models. Interestingly, we find that whether the detector and generator were trained on the same data is not critically important to the detection success. For instance the OPT-125M model has an AUC of 0.81 in detecting ChatGPT generations, whereas a larger model from the GPT family, GPTJ-6B, has AUC of 0.45. △ Less

Submitted 24 February, 2024; v1 submitted 16 May, 2023; originally announced May 2023.

arXiv:2209.06529 [pdf, other]

doi 10.1109/MSEC.2022.3178187

Data Privacy and Trustworthy Machine Learning

Authors: Martin Strobel, Reza Shokri

Abstract: The privacy risks of machine learning models is a major concern when training them on sensitive and personal data. We discuss the tradeoffs between data privacy and the remaining goals of trustworthy machine learning (notably, fairness, robustness, and explainability). The privacy risks of machine learning models is a major concern when training them on sensitive and personal data. We discuss the tradeoffs between data privacy and the remaining goals of trustworthy machine learning (notably, fairness, robustness, and explainability). △ Less

Submitted 14 September, 2022; originally announced September 2022.

Journal ref: Published in: IEEE Security & Privacy ( Volume: 20, Issue: 5, Sept.-Oct. 2022)

arXiv:2204.00032 [pdf, other]

Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets

Authors: Florian Tramèr, Reza Shokri, Ayrton San Joaquin, Hoang Le, Matthew Jagielski, Sanghyun Hong, Nicholas Carlini

Abstract: We introduce a new class of attacks on machine learning models. We show that an adversary who can poison a training dataset can cause models trained on this dataset to leak significant private details of training points belonging to other parties. Our active inference attacks connect two independent lines of work targeting the integrity and privacy of machine learning training data. Our attacks… ▽ More We introduce a new class of attacks on machine learning models. We show that an adversary who can poison a training dataset can cause models trained on this dataset to leak significant private details of training points belonging to other parties. Our active inference attacks connect two independent lines of work targeting the integrity and privacy of machine learning training data. Our attacks are effective across membership inference, attribute inference, and data extraction. For example, our targeted attacks can poison <0.1% of the training dataset to boost the performance of inference attacks by 1 to 2 orders of magnitude. Further, an adversary who controls a significant fraction of the training data (e.g., 50%) can launch untargeted attacks that enable 8x more precise inference on all other users' otherwise-private data points. Our results cast doubts on the relevance of cryptographic privacy guarantees in multiparty computation protocols for machine learning, if parties can arbitrarily select their share of training data. △ Less

Submitted 6 October, 2022; v1 submitted 31 March, 2022; originally announced April 2022.

Comments: ACM CCS 2022

arXiv:2203.05363 [pdf, other]

Differentially Private Learning Needs Hidden State (Or Much Faster Convergence)

Authors: Jiayuan Ye, Reza Shokri

Abstract: Prior work on differential privacy analysis of randomized SGD algorithms relies on composition theorems, where the implicit (unrealistic) assumption is that the internal state of the iterative algorithm is revealed to the adversary. As a result, the Rényi DP bounds derived by such composition-based analyses linearly grow with the number of training epochs. When the internal state of the algorithm… ▽ More Prior work on differential privacy analysis of randomized SGD algorithms relies on composition theorems, where the implicit (unrealistic) assumption is that the internal state of the iterative algorithm is revealed to the adversary. As a result, the Rényi DP bounds derived by such composition-based analyses linearly grow with the number of training epochs. When the internal state of the algorithm is hidden, we prove a converging privacy bound for noisy stochastic gradient descent (on strongly convex smooth loss functions). We show how to take advantage of privacy amplification by sub-sampling and randomized post-processing, and prove the dynamics of privacy bound for "shuffle and partition" and "sample without replacement" stochastic mini-batch gradient descent schemes. We prove that, in these settings, our privacy bound converges exponentially fast and is substantially smaller than the composition bounds, notably after a few number of training epochs. Thus, unless the DP algorithm converges fast, our privacy analysis shows that hidden state analysis can significantly amplify differential privacy. △ Less

Submitted 17 October, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

arXiv:2203.03929 [pdf, other]

Quantifying Privacy Risks of Masked Language Models Using Membership Inference Attacks

Authors: Fatemehsadat Mireshghallah, Kartik Goyal, Archit Uniyal, Taylor Berg-Kirkpatrick, Reza Shokri

Abstract: The wide adoption and application of Masked language models~(MLMs) on sensitive data (from legal to medical) necessitates a thorough quantitative investigation into their privacy vulnerabilities -- to what extent do MLMs leak information about their training data? Prior attempts at measuring leakage of MLMs via membership inference attacks have been inconclusive, implying the potential robustness… ▽ More The wide adoption and application of Masked language models~(MLMs) on sensitive data (from legal to medical) necessitates a thorough quantitative investigation into their privacy vulnerabilities -- to what extent do MLMs leak information about their training data? Prior attempts at measuring leakage of MLMs via membership inference attacks have been inconclusive, implying the potential robustness of MLMs to privacy attacks. In this work, we posit that prior attempts were inconclusive because they based their attack solely on the MLM's model score. We devise a stronger membership inference attack based on likelihood ratio hypothesis testing that involves an additional reference MLM to more accurately quantify the privacy risks of memorization in MLMs. We show that masked language models are extremely susceptible to likelihood ratio membership inference attacks: Our empirical results, on models trained on medical notes, show that our attack improves the AUC of prior membership inference attacks from 0.66 to an alarmingly high 0.90 level, with a significant improvement in the low-error region: at 1% false positive rate, our attack is 51X more powerful than prior work. △ Less

Submitted 3 November, 2022; v1 submitted 8 March, 2022; originally announced March 2022.

arXiv:2202.05520 [pdf, other]

What Does it Mean for a Language Model to Preserve Privacy?

Authors: Hannah Brown, Katherine Lee, Fatemehsadat Mireshghallah, Reza Shokri, Florian Tramèr

Abstract: Natural language reflects our private lives and identities, making its privacy concerns as broad as those of real life. Language models lack the ability to understand the context and sensitivity of text, and tend to memorize phrases present in their training sets. An adversary can exploit this tendency to extract training data. Depending on the nature of the content and the context in which this d… ▽ More Natural language reflects our private lives and identities, making its privacy concerns as broad as those of real life. Language models lack the ability to understand the context and sensitivity of text, and tend to memorize phrases present in their training sets. An adversary can exploit this tendency to extract training data. Depending on the nature of the content and the context in which this data was collected, this could violate expectations of privacy. Thus there is a growing interest in techniques for training language models that preserve privacy. In this paper, we discuss the mismatch between the narrow assumptions made by popular data protection techniques (data sanitization and differential privacy), and the broadness of natural language and of privacy as a social norm. We argue that existing protection methods cannot guarantee a generic and meaningful notion of privacy for language models. We conclude that language models should be trained on text data which was explicitly produced for public use. △ Less

Submitted 14 February, 2022; v1 submitted 11 February, 2022; originally announced February 2022.

Comments: 21 pages, 2 figures

arXiv:2111.09679 [pdf, other]

Enhanced Membership Inference Attacks against Machine Learning Models

Authors: Jiayuan Ye, Aadyaa Maddi, Sasi Kumar Murakonda, Vincent Bindschaedler, Reza Shokri

Abstract: How much does a machine learning algorithm leak about its training data, and why? Membership inference attacks are used as an auditing tool to quantify this leakage. In this paper, we present a comprehensive \textit{hypothesis testing framework} that enables us not only to formally express the prior work in a consistent way, but also to design new membership inference attacks that use reference mo… ▽ More How much does a machine learning algorithm leak about its training data, and why? Membership inference attacks are used as an auditing tool to quantify this leakage. In this paper, we present a comprehensive \textit{hypothesis testing framework} that enables us not only to formally express the prior work in a consistent way, but also to design new membership inference attacks that use reference models to achieve a significantly higher power (true positive rate) for any (false positive rate) error. More importantly, we explain \textit{why} different attacks perform differently. We present a template for indistinguishability games, and provide an interpretation of attack success rate across different instances of the game. We discuss various uncertainties of attackers that arise from the formulation of the problem, and show how our approach tries to minimize the attack uncertainty to the one bit secret about the presence or absence of a data point in the training set. We perform a \textit{differential analysis} between all types of attacks, explain the gap between them, and show what causes data points to be vulnerable to an attack (as the reasons vary due to different granularities of memorization, from overfitting to conditional memorization). Our auditing framework is openly accessible as part of the \textit{Privacy Meter} software tool. △ Less

Submitted 13 September, 2022; v1 submitted 18 November, 2021; originally announced November 2021.

Comments: To appear at ACM CCS 2022

arXiv:2102.05855 [pdf, ps, other]

Differential Privacy Dynamics of Langevin Diffusion and Noisy Gradient Descent

Authors: Rishav Chourasia, Jiayuan Ye, Reza Shokri

Abstract: What is the information leakage of an iterative randomized learning algorithm about its training data, when the internal state of the algorithm is \emph{private}? How much is the contribution of each specific training epoch to the information leakage through the released model? We study this problem for noisy gradient descent algorithms, and model the \emph{dynamics} of Rényi differential privacy… ▽ More What is the information leakage of an iterative randomized learning algorithm about its training data, when the internal state of the algorithm is \emph{private}? How much is the contribution of each specific training epoch to the information leakage through the released model? We study this problem for noisy gradient descent algorithms, and model the \emph{dynamics} of Rényi differential privacy loss throughout the training process. Our analysis traces a provably \emph{tight} bound on the Rényi divergence between the pair of probability distributions over parameters of models trained on neighboring datasets. We prove that the privacy loss converges exponentially fast, for smooth and strongly convex loss functions, which is a significant improvement over composition theorems (which over-estimate the privacy loss by upper-bounding its total value over all intermediate gradient computations). For Lipschitz, smooth, and strongly convex loss functions, we prove optimal utility with a small gradient complexity for noisy gradient descent algorithms. △ Less

Submitted 8 September, 2022; v1 submitted 11 February, 2021; originally announced February 2021.

arXiv:2011.03731 [pdf, other]

On the Privacy Risks of Algorithmic Fairness

Authors: Hongyan Chang, Reza Shokri

Abstract: Algorithmic fairness and privacy are essential pillars of trustworthy machine learning. Fair machine learning aims at minimizing discrimination against protected groups by, for example, imposing a constraint on models to equalize their behavior across different groups. This can subsequently change the influence of training data points on the fair model, in a disproportionate way. We study how this… ▽ More Algorithmic fairness and privacy are essential pillars of trustworthy machine learning. Fair machine learning aims at minimizing discrimination against protected groups by, for example, imposing a constraint on models to equalize their behavior across different groups. This can subsequently change the influence of training data points on the fair model, in a disproportionate way. We study how this can change the information leakage of the model about its training data. We analyze the privacy risks of group fairness (e.g., equalized odds) through the lens of membership inference attacks: inferring whether a data point is used for training a model. We show that fairness comes at the cost of privacy, and this cost is not distributed equally: the information leakage of fair models increases significantly on the unprivileged subgroups, which are the ones for whom we need fair learning. We show that the more biased the training data is, the higher the privacy cost of achieving fairness for the unprivileged subgroups will be. We provide comprehensive empirical analysis for general machine learning algorithms. △ Less

Submitted 7 April, 2021; v1 submitted 7 November, 2020; originally announced November 2020.

arXiv:2007.12934 [pdf, other]

SOTERIA: In Search of Efficient Neural Networks for Private Inference

Authors: Anshul Aggarwal, Trevor E. Carlson, Reza Shokri, Shruti Tople

Abstract: ML-as-a-service is gaining popularity where a cloud server hosts a trained model and offers prediction (inference) service to users. In this setting, our objective is to protect the confidentiality of both the users' input queries as well as the model parameters at the server, with modest computation and communication overhead. Prior solutions primarily propose fine-tuning cryptographic methods to… ▽ More ML-as-a-service is gaining popularity where a cloud server hosts a trained model and offers prediction (inference) service to users. In this setting, our objective is to protect the confidentiality of both the users' input queries as well as the model parameters at the server, with modest computation and communication overhead. Prior solutions primarily propose fine-tuning cryptographic methods to make them efficient for known fixed model architectures. The drawback with this line of approach is that the model itself is never designed to operate with existing efficient cryptographic computations. We observe that the network architecture, internal functions, and parameters of a model, which are all chosen during training, significantly influence the computation and communication overhead of a cryptographic method, during inference. Based on this observation, we propose SOTERIA -- a training method to construct model architectures that are by-design efficient for private inference. We use neural architecture search algorithms with the dual objective of optimizing the accuracy of the model and the overhead of using cryptographic primitives for secure inference. Given the flexibility of modifying a model during training, we find accurate models that are also efficient for private computation. We select garbled circuits as our underlying cryptographic primitive, due to their expressiveness and efficiency, but this approach can be extended to hybrid multi-party computation settings. We empirically evaluate SOTERIA on MNIST and CIFAR10 datasets, to compare with the prior work. Our results confirm that SOTERIA is indeed effective in balancing performance and accuracy. △ Less

Submitted 25 July, 2020; originally announced July 2020.

arXiv:2007.11524 [pdf, ps, other]

Improving Deep Learning with Differential Privacy using Gradient Encoding and Denoising

Authors: Milad Nasr, Reza Shokri, Amir houmansadr

Abstract: Deep learning models leak significant amounts of information about their training datasets. Previous work has investigated training models with differential privacy (DP) guarantees through adding DP noise to the gradients. However, such solutions (specifically, DPSGD), result in large degradations in the accuracy of the trained models. In this paper, we aim at training deep learning models with DP… ▽ More Deep learning models leak significant amounts of information about their training datasets. Previous work has investigated training models with differential privacy (DP) guarantees through adding DP noise to the gradients. However, such solutions (specifically, DPSGD), result in large degradations in the accuracy of the trained models. In this paper, we aim at training deep learning models with DP guarantees while preserving model accuracy much better than previous works. Our key technique is to encode gradients to map them to a smaller vector space, therefore enabling us to obtain DP guarantees for different noise distributions. This allows us to investigate and choose noise distributions that best preserve model accuracy for a target privacy budget. We also take advantage of the post-processing property of differential privacy by introducing the idea of denoising, which further improves the utility of the trained models without degrading their DP guarantees. We show that our mechanism outperforms the state-of-the-art DPSGD; for instance, for the same model accuracy of $96.1\%$ on MNIST, our technique results in a privacy bound of $ε=3.2$ compared to $ε=6$ of DPSGD, which is a significant improvement. △ Less

Submitted 22 July, 2020; originally announced July 2020.

arXiv:2007.09339 [pdf, other]

ML Privacy Meter: Aiding Regulatory Compliance by Quantifying the Privacy Risks of Machine Learning

Authors: Sasi Kumar Murakonda, Reza Shokri

Abstract: When building machine learning models using sensitive data, organizations should ensure that the data processed in such systems is adequately protected. For projects involving machine learning on personal data, Article 35 of the GDPR mandates it to perform a Data Protection Impact Assessment (DPIA). In addition to the threats of illegitimate access to data through security breaches, machine learni… ▽ More When building machine learning models using sensitive data, organizations should ensure that the data processed in such systems is adequately protected. For projects involving machine learning on personal data, Article 35 of the GDPR mandates it to perform a Data Protection Impact Assessment (DPIA). In addition to the threats of illegitimate access to data through security breaches, machine learning models pose an additional privacy risk to the data by indirectly revealing about it through the model predictions and parameters. Guidances released by the Information Commissioner's Office (UK) and the National Institute of Standards and Technology (US) emphasize on the threat to data from models and recommend organizations to account for and estimate these risks to comply with data protection regulations. Hence, there is an immediate need for a tool that can quantify the privacy risk to data from models. In this paper, we focus on this indirect leakage about training data from machine learning models. We present ML Privacy Meter, a tool that can quantify the privacy risk to data from models through state of the art membership inference attack techniques. We discuss how this tool can help practitioners in compliance with data protection regulations, when deploying machine learning models. △ Less

Submitted 18 July, 2020; originally announced July 2020.

arXiv:2006.09129 [pdf, other]

Model Explanations with Differential Privacy

Authors: Neel Patel, Reza Shokri, Yair Zick

Abstract: Black-box machine learning models are used in critical decision-making domains, giving rise to several calls for more algorithmic transparency. The drawback is that model explanations can leak information about the training data and the explanation data used to generate them, thus undermining data privacy. To address this issue, we propose differentially private algorithms to construct feature-bas… ▽ More Black-box machine learning models are used in critical decision-making domains, giving rise to several calls for more algorithmic transparency. The drawback is that model explanations can leak information about the training data and the explanation data used to generate them, thus undermining data privacy. To address this issue, we propose differentially private algorithms to construct feature-based model explanations. We design an adaptive differentially private gradient descent algorithm, that finds the minimal privacy budget required to produce accurate explanations. It reduces the overall privacy loss on explanation data, by adaptively reusing past differentially private explanations. It also amplifies the privacy guarantees with respect to the training data. We evaluate the implications of differentially private models and our privacy mechanisms on the quality of model explanations. △ Less

Submitted 16 June, 2020; originally announced June 2020.

Comments: 33 pages, 9 figures

arXiv:2006.08669 [pdf, other]

On Adversarial Bias and the Robustness of Fair Machine Learning

Authors: Hongyan Chang, Ta Duy Nguyen, Sasi Kumar Murakonda, Ehsan Kazemi, Reza Shokri

Abstract: Optimizing prediction accuracy can come at the expense of fairness. Towards minimizing discrimination against a group, fair machine learning algorithms strive to equalize the behavior of a model across different groups, by imposing a fairness constraint on models. However, we show that giving the same importance to groups of different sizes and distributions, to counteract the effect of bias in tr… ▽ More Optimizing prediction accuracy can come at the expense of fairness. Towards minimizing discrimination against a group, fair machine learning algorithms strive to equalize the behavior of a model across different groups, by imposing a fairness constraint on models. However, we show that giving the same importance to groups of different sizes and distributions, to counteract the effect of bias in training data, can be in conflict with robustness. We analyze data poisoning attacks against group-based fair machine learning, with the focus on equalized odds. An adversary who can control sampling or labeling for a fraction of training data, can reduce the test accuracy significantly beyond what he can achieve on unconstrained models. Adversarial sampling and adversarial labeling attacks can also worsen the model's fairness gap on test data, even though the model satisfies the fairness constraint on training data. We analyze the robustness of fair machine learning through an empirical evaluation of attacks on multiple algorithms and benchmark datasets. △ Less

Submitted 15 June, 2020; originally announced June 2020.

arXiv:2004.13293 [pdf, other]

Epione: Lightweight Contact Tracing with Strong Privacy

Authors: Ni Trieu, Kareem Shehata, Prateek Saxena, Reza Shokri, Dawn Song

Abstract: Contact tracing is an essential tool in containing infectious diseases such as COVID-19. Many countries and research groups have launched or announced mobile apps to facilitate contact tracing by recording contacts between users with some privacy considerations. Most of the focus has been on using random tokens, which are exchanged during encounters and stored locally on users' phones. Prior syste… ▽ More Contact tracing is an essential tool in containing infectious diseases such as COVID-19. Many countries and research groups have launched or announced mobile apps to facilitate contact tracing by recording contacts between users with some privacy considerations. Most of the focus has been on using random tokens, which are exchanged during encounters and stored locally on users' phones. Prior systems allow users to search over released tokens in order to learn if they have recently been in the proximity of a user that has since been diagnosed with the disease. However, prior approaches do not provide end-to-end privacy in the collection and querying of tokens. In particular, these approaches are vulnerable to either linkage attacks by users using token metadata, linkage attacks by the server, or false reporting by users. In this work, we introduce Epione, a lightweight system for contact tracing with strong privacy protections. Epione alerts users directly if any of their contacts have been diagnosed with the disease, while protecting the privacy of users' contacts from both central services and other users, and provides protection against false reporting. As a key building block, we present a new cryptographic tool for secure two-party private set intersection cardinality (PSI-CA), which allows two parties, each holding a set of items, to learn the intersection size of two private sets without revealing intersection items. We specifically tailor it to the case of large-scale contact tracing where clients have small input sets and the server's database of tokens is much larger. △ Less

Submitted 2 May, 2020; v1 submitted 28 April, 2020; originally announced April 2020.

arXiv:1912.11279 [pdf, ps, other]

Cronus: Robust and Heterogeneous Collaborative Learning with Black-Box Knowledge Transfer

Authors: Hongyan Chang, Virat Shejwalkar, Reza Shokri, Amir Houmansadr

Abstract: Collaborative (federated) learning enables multiple parties to train a model without sharing their private data, but through repeated sharing of the parameters of their local models. Despite its advantages, this approach has many known privacy and security weaknesses and performance overhead, in addition to being limited only to models with homogeneous architectures. Shared parameters leak a signi… ▽ More Collaborative (federated) learning enables multiple parties to train a model without sharing their private data, but through repeated sharing of the parameters of their local models. Despite its advantages, this approach has many known privacy and security weaknesses and performance overhead, in addition to being limited only to models with homogeneous architectures. Shared parameters leak a significant amount of information about the local (and supposedly private) datasets. Besides, federated learning is severely vulnerable to poisoning attacks, where some participants can adversarially influence the aggregate parameters. Large models, with high dimensional parameter vectors, are in particular highly susceptible to privacy and security attacks: curse of dimensionality in federated learning. We argue that sharing parameters is the most naive way of information exchange in collaborative learning, as they open all the internal state of the model to inference attacks, and maximize the model's malleability by stealthy poisoning attacks. We propose Cronus, a robust collaborative machine learning framework. The simple yet effective idea behind designing Cronus is to control, unify, and significantly reduce the dimensions of the exchanged information between parties, through robust knowledge transfer between their black-box local models. We evaluate all existing federated learning algorithms against poisoning attacks, and we show that Cronus is the only secure method, due to its tight robustness guarantee. Treating local models as black-box, reduces the information leakage through models, and enables us using existing privacy-preserving algorithms that mitigate the risk of information leakage through the model's output (predictions). Cronus also has a significantly lower sample complexity, compared to federated learning, which does not bind its security to the number of participants. △ Less

Submitted 24 December, 2019; originally announced December 2019.

arXiv:1909.12982 [pdf, other]

Robust Membership Encoding: Inference Attacks and Copyright Protection for Deep Learning

Authors: Congzheng Song, Reza Shokri

Abstract: Machine learning as a service (MLaaS), and algorithm marketplaces are on a rise. Data holders can easily train complex models on their data using third party provided learning codes. Training accurate ML models requires massive labeled data and advanced learning algorithms. The resulting models are considered as intellectual property of the model owners and their copyright should be protected. Als… ▽ More Machine learning as a service (MLaaS), and algorithm marketplaces are on a rise. Data holders can easily train complex models on their data using third party provided learning codes. Training accurate ML models requires massive labeled data and advanced learning algorithms. The resulting models are considered as intellectual property of the model owners and their copyright should be protected. Also, MLaaS needs to be trusted not to embed secret information about the training data into the model, such that it could be later retrieved when the model is deployed. In this paper, we present \emph{membership encoding} for training deep neural networks and encoding the membership information, i.e. whether a data point is used for training, for a subset of training data. Membership encoding has several applications in different scenarios, including robust watermarking for model copyright protection, and also the risk analysis of stealthy data embedding privacy attacks. Our encoding algorithm can determine the membership of significantly redacted data points, and is also robust to model compression and fine-tuning. It also enables encoding a significant fraction of the training set, with negligible drop in the model's prediction accuracy. △ Less

Submitted 21 March, 2020; v1 submitted 27 September, 2019; originally announced September 2019.

arXiv:1907.00164 [pdf, other]

On the Privacy Risks of Model Explanations

Authors: Reza Shokri, Martin Strobel, Yair Zick

Abstract: Privacy and transparency are two key foundations of trustworthy machine learning. Model explanations offer insights into a model's decisions on input data, whereas privacy is primarily concerned with protecting information about the training data. We analyze connections between model explanations and the leakage of sensitive information about the model's training set. We investigate the privacy ri… ▽ More Privacy and transparency are two key foundations of trustworthy machine learning. Model explanations offer insights into a model's decisions on input data, whereas privacy is primarily concerned with protecting information about the training data. We analyze connections between model explanations and the leakage of sensitive information about the model's training set. We investigate the privacy risks of feature-based model explanations using membership inference attacks: quantifying how much model predictions plus their explanations leak information about the presence of a datapoint in the training set of a model. We extensively evaluate membership inference attacks based on feature-based model explanations, over a variety of datasets. We show that backpropagation-based explanations can leak a significant amount of information about individual training datapoints. This is because they reveal statistical information about the decision boundaries of the model about an input, which can reveal its membership. We also empirically investigate the trade-off between privacy and explanation quality, by studying the perturbation-based model explanations. △ Less

Submitted 5 February, 2021; v1 submitted 29 June, 2019; originally announced July 2019.

Comments: 19 pages, 13 figures

arXiv:1905.13409 [pdf, other]

Bypassing Backdoor Detection Algorithms in Deep Learning

Authors: Te Juin Lester Tan, Reza Shokri

Abstract: Deep learning models are vulnerable to various adversarial manipulations of their training data, parameters, and input sample. In particular, an adversary can modify the training data and model parameters to embed backdoors into the model, so the model behaves according to the adversary's objective if the input contains the backdoor features, referred to as the backdoor trigger (e.g., a stamp on a… ▽ More Deep learning models are vulnerable to various adversarial manipulations of their training data, parameters, and input sample. In particular, an adversary can modify the training data and model parameters to embed backdoors into the model, so the model behaves according to the adversary's objective if the input contains the backdoor features, referred to as the backdoor trigger (e.g., a stamp on an image). The poisoned model's behavior on clean data, however, remains unchanged. Many detection algorithms are designed to detect backdoors on input samples or model parameters, through the statistical difference between the latent representations of adversarial and clean input samples in the poisoned model. In this paper, we design an adversarial backdoor embedding algorithm that can bypass the existing detection algorithms including the state-of-the-art techniques. We design an adaptive adversarial training algorithm that optimizes the original loss function of the model, and also maximizes the indistinguishability of the hidden representations of poisoned data and clean data. This work calls for designing adversary-aware defense mechanisms for backdoor detection. △ Less

Submitted 6 June, 2020; v1 submitted 31 May, 2019; originally announced May 2019.

Comments: IEEE European Symposium on Security and Privacy 2020

arXiv:1905.12774 [pdf, ps, other]

Quantifying the Privacy Risks of Learning High-Dimensional Graphical Models

Authors: Sasi Kumar Murakonda, Reza Shokri, George Theodorakopoulos

Abstract: Models leak information about their training data. This enables attackers to infer sensitive information about their training sets, notably determine if a data sample was part of the model's training set. The existing works empirically show the possibility of these membership inference (tracing) attacks against complex deep learning models. However, the attack results are dependent on the specific… ▽ More Models leak information about their training data. This enables attackers to infer sensitive information about their training sets, notably determine if a data sample was part of the model's training set. The existing works empirically show the possibility of these membership inference (tracing) attacks against complex deep learning models. However, the attack results are dependent on the specific training data, can be obtained only after the tedious process of training the model and performing the attack, and are missing any measure of the confidence and unused potential power of the attack. In this paper, we theoretically analyze the maximum power of tracing attacks against high-dimensional graphical models, with the focus on Bayesian networks. We provide a tight upper bound on the power (true positive rate) of these attacks, with respect to their error (false positive rate), for a given model structure even before learning its parameters. As it should be, the bound is independent of the knowledge and algorithm of any specific attack. It can help in identifying which model structures leak more information, how adding new parameters to the model increases its privacy risk, and what can be gained by adding new data points to decrease the overall information leakage. It provides a measure of the potential leakage of a model given its structure, as a function of the model complexity and the size of the training set. △ Less

Submitted 17 February, 2021; v1 submitted 29 May, 2019; originally announced May 2019.

arXiv:1905.10291 [pdf, other]

doi 10.1145/3319535.3354211

Privacy Risks of Securing Machine Learning Models against Adversarial Examples

Authors: Liwei Song, Reza Shokri, Prateek Mittal

Abstract: The arms race between attacks and defenses for machine learning models has come to a forefront in recent years, in both the security community and the privacy community. However, one big limitation of previous research is that the security domain and the privacy domain have typically been considered separately. It is thus unclear whether the defense methods in one domain will have any unexpected i… ▽ More The arms race between attacks and defenses for machine learning models has come to a forefront in recent years, in both the security community and the privacy community. However, one big limitation of previous research is that the security domain and the privacy domain have typically been considered separately. It is thus unclear whether the defense methods in one domain will have any unexpected impact on the other domain. In this paper, we take a step towards resolving this limitation by combining the two domains. In particular, we measure the success of membership inference attacks against six state-of-the-art defense methods that mitigate the risk of adversarial examples (i.e., evasion attacks). Membership inference attacks determine whether or not an individual data record has been part of a model's training set. The accuracy of such attacks reflects the information leakage of training algorithms about individual members of the training set. Adversarial defense methods against adversarial examples influence the model's decision boundaries such that model predictions remain unchanged for a small area around each input. However, this objective is optimized on training data. Thus, individual data records in the training set have a significant influence on robust models. This makes the models more vulnerable to inference attacks. To perform the membership inference attacks, we leverage the existing inference methods that exploit model predictions. We also propose two new inference methods that exploit structural properties of robust models on adversarially perturbed data. Our experimental evaluation demonstrates that compared with the natural training (undefended) approach, adversarial defense methods can indeed increase the target model's risk against membership inference attacks. △ Less

Submitted 25 August, 2019; v1 submitted 24 May, 2019; originally announced May 2019.

Comments: ACM CCS 2019, code is available at https://github.com/inspire-group/privacy-vs-robustness

arXiv:1812.00910 [pdf, ps, other]

doi 10.1109/SP.2019.00065

Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning

Authors: Milad Nasr, Reza Shokri, Amir Houmansadr

Abstract: Deep neural networks are susceptible to various inference attacks as they remember information about their training data. We design white-box inference attacks to perform a comprehensive privacy analysis of deep learning models. We measure the privacy leakage through parameters of fully trained models as well as the parameter updates of models during training. We design inference algorithms for bo… ▽ More Deep neural networks are susceptible to various inference attacks as they remember information about their training data. We design white-box inference attacks to perform a comprehensive privacy analysis of deep learning models. We measure the privacy leakage through parameters of fully trained models as well as the parameter updates of models during training. We design inference algorithms for both centralized and federated learning, with respect to passive and active inference attackers, and assuming different adversary prior knowledge. We evaluate our novel white-box membership inference attacks against deep learning algorithms to trace their training data records. We show that a straightforward extension of the known black-box attacks to the white-box setting (through analyzing the outputs of activation functions) is ineffective. We therefore design new algorithms tailored to the white-box setting by exploiting the privacy vulnerabilities of the stochastic gradient descent algorithm, which is the algorithm used to train deep neural networks. We investigate the reasons why deep learning models may leak information about their training data. We then show that even well-generalized models are significantly susceptible to white-box membership inference attacks, by analyzing state-of-the-art pre-trained and publicly available models for the CIFAR dataset. We also show how adversarial participants, in the federated learning setting, can successfully run active membership inference attacks against other participants, even when the global model achieves high prediction accuracies. △ Less

Submitted 6 June, 2020; v1 submitted 3 December, 2018; originally announced December 2018.

Comments: 2019 IEEE Symposium on Security and Privacy (SP)

arXiv:1807.05852 [pdf, ps, other]

Machine Learning with Membership Privacy using Adversarial Regularization

Authors: Milad Nasr, Reza Shokri, Amir Houmansadr

Abstract: Machine learning models leak information about the datasets on which they are trained. An adversary can build an algorithm to trace the individual members of a model's training dataset. As a fundamental inference attack, he aims to distinguish between data points that were part of the model's training set and any other data points from the same distribution. This is known as the tracing (and also… ▽ More Machine learning models leak information about the datasets on which they are trained. An adversary can build an algorithm to trace the individual members of a model's training dataset. As a fundamental inference attack, he aims to distinguish between data points that were part of the model's training set and any other data points from the same distribution. This is known as the tracing (and also membership inference) attack. In this paper, we focus on such attacks against black-box models, where the adversary can only observe the output of the model, but not its parameters. This is the current setting of machine learning as a service in the Internet. We introduce a privacy mechanism to train machine learning models that provably achieve membership privacy: the model's predictions on its training data are indistinguishable from its predictions on other data points from the same distribution. We design a strategic mechanism where the privacy mechanism anticipates the membership inference attacks. The objective is to train a model such that not only does it have the minimum prediction error (high utility), but also it is the most robust model against its corresponding strongest inference attack (high privacy). We formalize this as a min-max game optimization problem, and design an adversarial training algorithm that minimizes the classification loss of the model as well as the maximum gain of the membership inference attack against it. This strategy, which guarantees membership privacy (as prediction indistinguishability), acts also as a strong regularizer and significantly generalizes the model. We evaluate our privacy mechanism on deep neural networks using different benchmark datasets. We show that our min-max strategy can mitigate the risk of membership inference attacks (close to the random guess) with a negligible cost in terms of the classification error. △ Less

Submitted 16 July, 2018; originally announced July 2018.

arXiv:1803.05961 [pdf, other]

Chiron: Privacy-preserving Machine Learning as a Service

Authors: Tyler Hunt, Congzheng Song, Reza Shokri, Vitaly Shmatikov, Emmett Witchel

Abstract: Major cloud operators offer machine learning (ML) as a service, enabling customers who have the data but not ML expertise or infrastructure to train predictive models on this data. Existing ML-as-a-service platforms require users to reveal all training data to the service operator. We design, implement, and evaluate Chiron, a system for privacy-preserving machine learning as a service. First, Chir… ▽ More Major cloud operators offer machine learning (ML) as a service, enabling customers who have the data but not ML expertise or infrastructure to train predictive models on this data. Existing ML-as-a-service platforms require users to reveal all training data to the service operator. We design, implement, and evaluate Chiron, a system for privacy-preserving machine learning as a service. First, Chiron conceals the training data from the service operator. Second, in keeping with how many existing ML-as-a-service platforms work, Chiron reveals neither the training algorithm nor the model structure to the user, providing only black-box access to the trained model. Chiron is implemented using SGX enclaves, but SGX alone does not achieve the dual goals of data privacy and model confidentiality. Chiron runs the standard ML training toolchain (including the popular Theano framework and C compiler) in an enclave, but the untrusted model-creation code from the service operator is further confined in a Ryoan sandbox to prevent it from leaking the training data outside the enclave. To support distributed training, Chiron executes multiple concurrent enclaves that exchange model parameters via a parameter server. We evaluate Chiron on popular deep learning models, focusing on benchmark image classification tasks such as CIFAR and ImageNet, and show that its training performance and accuracy of the resulting models are practical for common uses of ML-as-a-service. △ Less

Submitted 15 March, 2018; originally announced March 2018.

arXiv:1708.07975 [pdf, ps, other]

Plausible Deniability for Privacy-Preserving Data Synthesis

Authors: Vincent Bindschaedler, Reza Shokri, Carl A. Gunter

Abstract: Releasing full data records is one of the most challenging problems in data privacy. On the one hand, many of the popular techniques such as data de-identification are problematic because of their dependence on the background knowledge of adversaries. On the other hand, rigorous methods such as the exponential mechanism for differential privacy are often computationally impractical to use for rele… ▽ More Releasing full data records is one of the most challenging problems in data privacy. On the one hand, many of the popular techniques such as data de-identification are problematic because of their dependence on the background knowledge of adversaries. On the other hand, rigorous methods such as the exponential mechanism for differential privacy are often computationally impractical to use for releasing high dimensional data or cannot preserve high utility of original data due to their extensive data perturbation. This paper presents a criterion called plausible deniability that provides a formal privacy guarantee, notably for releasing sensitive datasets: an output record can be released only if a certain amount of input records are indistinguishable, up to a privacy parameter. This notion does not depend on the background knowledge of an adversary. Also, it can efficiently be checked by privacy tests. We present mechanisms to generate synthetic datasets with similar statistical properties to the input data and the same format. We study this technique both theoretically and experimentally. A key theoretical result shows that, with proper randomization, the plausible deniability mechanism generates differentially private synthetic data. We demonstrate the efficiency of this generative technique on a large dataset; it is shown to preserve the utility of original data with respect to various statistical analysis and machine learning measures. △ Less

Submitted 26 August, 2017; originally announced August 2017.

Comments: In PVLDB 2017

arXiv:1610.05820 [pdf, other]

Membership Inference Attacks against Machine Learning Models

Authors: Reza Shokri, Marco Stronati, Congzheng Song, Vitaly Shmatikov

Abstract: We quantitatively investigate how machine learning models leak information about the individual data records on which they were trained. We focus on the basic membership inference attack: given a data record and black-box access to a model, determine if the record was in the model's training dataset. To perform membership inference against a target model, we make adversarial use of machine learnin… ▽ More We quantitatively investigate how machine learning models leak information about the individual data records on which they were trained. We focus on the basic membership inference attack: given a data record and black-box access to a model, determine if the record was in the model's training dataset. To perform membership inference against a target model, we make adversarial use of machine learning and train our own inference model to recognize differences in the target model's predictions on the inputs that it trained on versus the inputs that it did not train on. We empirically evaluate our inference techniques on classification models trained by commercial "machine learning as a service" providers such as Google and Amazon. Using realistic datasets and classification tasks, including a hospital discharge dataset whose membership is sensitive from the privacy perspective, we show that these models can be vulnerable to membership inference attacks. We then investigate the factors that influence this leakage and evaluate mitigation strategies. △ Less

Submitted 31 March, 2017; v1 submitted 18 October, 2016; originally announced October 2016.

Comments: In the proceedings of the IEEE Symposium on Security and Privacy, 2017

arXiv:1609.00408 [pdf, other]

Defeating Image Obfuscation with Deep Learning

Authors: Richard McPherson, Reza Shokri, Vitaly Shmatikov

Abstract: We demonstrate that modern image recognition methods based on artificial neural networks can recover hidden information from images protected by various forms of obfuscation. The obfuscation techniques considered in this paper are mosaicing (also known as pixelation), blurring (as used by YouTube), and P3, a recently proposed system for privacy-preserving photo sharing that encrypts the significan… ▽ More We demonstrate that modern image recognition methods based on artificial neural networks can recover hidden information from images protected by various forms of obfuscation. The obfuscation techniques considered in this paper are mosaicing (also known as pixelation), blurring (as used by YouTube), and P3, a recently proposed system for privacy-preserving photo sharing that encrypts the significant JPEG coefficients to make images unrecognizable by humans. We empirically show how to train artificial neural networks to successfully identify faces and recognize objects and handwritten digits even if the images are protected using any of the above obfuscation techniques. △ Less

Submitted 6 September, 2016; v1 submitted 1 September, 2016; originally announced September 2016.

arXiv:1505.07499 [pdf, other]

Privacy through Fake yet Semantically Real Traces

Authors: Vincent Bindschaedler, Reza Shokri

Abstract: Camouflaging data by generating fake information is a well-known obfuscation technique for protecting data privacy. In this paper, we focus on a very sensitive and increasingly exposed type of data: location data. There are two main scenarios in which fake traces are of extreme value to preserve location privacy: publishing datasets of location trajectories, and using location-based services. Desp… ▽ More Camouflaging data by generating fake information is a well-known obfuscation technique for protecting data privacy. In this paper, we focus on a very sensitive and increasingly exposed type of data: location data. There are two main scenarios in which fake traces are of extreme value to preserve location privacy: publishing datasets of location trajectories, and using location-based services. Despite advances in protecting (location) data privacy, there is no quantitative method to evaluate how realistic a synthetic trace is, and how much utility and privacy it provides in each scenario. Also, the lack of a methodology to generate privacy-preserving fake traces is evident. In this paper, we fill this gap and propose the first statistical metric and model to generate fake location traces such that both the utility of data and the privacy of users are preserved. We build upon the fact that, although geographically they visit distinct locations, people have strongly semantically similar mobility patterns, for example, their transition pattern across activities (e.g., working, driving, staying at home) is similar. We define a statistical metric and propose an algorithm that automatically discovers the hidden semantic similarities between locations from a bag of real location traces as seeds, without requiring any initial semantic annotations. We guarantee that fake traces are geographically dissimilar to their seeds, so they do not leak sensitive location information. We also protect contributors to seed traces against membership attacks. Interleaving fake traces with mobile users' traces is a prominent location privacy defense mechanism. We quantitatively show the effectiveness of our methodology in protecting against localization inference attacks while preserving utility of sharing/publishing traces. △ Less

Submitted 27 May, 2015; originally announced May 2015.

arXiv:1409.1716 [pdf, other]

Prolonging the Hide-and-Seek Game: Optimal Trajectory Privacy for Location-Based Services

Authors: George Theodorakopoulos, Reza Shokri, Carmela Troncoso, Jean-Pierre Hubaux, Jean-Yves Le Boudec

Abstract: Human mobility is highly predictable. Individuals tend to only visit a few locations with high frequency, and to move among them in a certain sequence reflecting their habits and daily routine. This predictability has to be taken into account in the design of location privacy preserving mechanisms (LPPMs) in order to effectively protect users when they continuously expose their position to locatio… ▽ More Human mobility is highly predictable. Individuals tend to only visit a few locations with high frequency, and to move among them in a certain sequence reflecting their habits and daily routine. This predictability has to be taken into account in the design of location privacy preserving mechanisms (LPPMs) in order to effectively protect users when they continuously expose their position to location-based services (LBSs). In this paper, we describe a method for creating LPPMs that are customized for a user's mobility profile taking into account privacy and quality of service requirements. By construction, our LPPMs take into account the sequential correlation across the user's exposed locations, providing the maximum possible trajectory privacy, i.e., privacy for the user's present location, as well as past and expected future locations. Moreover, our LPPMs are optimal against a strategic adversary, i.e., an attacker that implements the strongest inference attack knowing both the LPPM operation and the user's mobility profile. The optimality of the LPPMs in the context of trajectory privacy is a novel contribution, and it is achieved by formulating the LPPM design problem as a Bayesian Stackelberg game between the user and the adversary. An additional benefit of our formal approach is that the design parameters of the LPPM are chosen by the optimization algorithm. △ Less

Submitted 5 September, 2014; originally announced September 2014.

Comments: Workshop on Privacy in the Electronic Society (WPES 2014)

ACM Class: C.2.0

arXiv:1402.3426 [pdf, other]

doi 10.1515/popets-2015-0024

Privacy Games: Optimal User-Centric Data Obfuscation

Authors: Reza Shokri

Abstract: In this paper, we design user-centric obfuscation mechanisms that impose the minimum utility loss for guaranteeing user's privacy. We optimize utility subject to a joint guarantee of differential privacy (indistinguishability) and distortion privacy (inference error). This double shield of protection limits the information leakage through obfuscation mechanism as well as the posterior inference. W… ▽ More In this paper, we design user-centric obfuscation mechanisms that impose the minimum utility loss for guaranteeing user's privacy. We optimize utility subject to a joint guarantee of differential privacy (indistinguishability) and distortion privacy (inference error). This double shield of protection limits the information leakage through obfuscation mechanism as well as the posterior inference. We show that the privacy achieved through joint differential-distortion mechanisms against optimal attacks is as large as the maximum privacy that can be achieved by either of these mechanisms separately. Their utility cost is also not larger than what either of the differential or distortion mechanisms imposes. We model the optimization problem as a leader-follower game between the designer of obfuscation mechanism and the potential adversary, and design adaptive mechanisms that anticipate and protect against optimal inference algorithms. Thus, the obfuscation mechanism is optimal against any inference algorithm. △ Less

Submitted 27 May, 2015; v1 submitted 14 February, 2014; originally announced February 2014.

Showing 1–41 of 41 results for author: Shokri, R