Search | arXiv e-print repository

De-amplifying Bias from Differential Privacy in Language Model Fine-tuning

Authors: Sanjari Srivastava, Piotr Mardziel, Zhikhun Zhang, Archana Ahlawat, Anupam Datta, John C Mitchell

Abstract: Fairness and privacy are two important values machine learning (ML) practitioners often seek to operationalize in models. Fairness aims to reduce model bias for social/demographic sub-groups. Privacy via differential privacy (DP) mechanisms, on the other hand, limits the impact of any individual's training data on the resulting model. The trade-offs between privacy and fairness goals of trustworthy ML pose a challenge to those wishing to address both. We show that DP amplifies gender, racial, and religious bias when fine-tuning large language models (LLMs), producing models more biased than ones fine-tuned without DP. We find the cause of the amplification to be a disparity in convergence of gradients across sub-groups. Through the case of binary gender bias, we demonstrate that Counterfactual Data Augmentation (CDA), a known method for addressing bias, also mitigates bias amplification by DP. As a consequence, DP and CDA together can be used to fine-tune models while maintaining both fairness and privacy.

Submitted 6 February, 2024; originally announced February 2024.

arXiv:2011.00740 [pdf, other]

Influence Patterns for Explaining Information Flow in BERT

Authors: Kaiji Lu, Zifan Wang, Piotr Mardziel, Anupam Datta

Abstract: While "attention is all you need" may be proving true, we do not know why: attention-based transformer models such as BERT are superior, but how information flows from input tokens to output predictions is unclear. We introduce influence patterns, abstractions of sets of paths through a transformer model. Patterns quantify and localize the flow of information to paths passing through a sequence of model nodes. Experimentally, we find that a significant portion of information flow in BERT goes through skip connections instead of attention heads. We further show that consistency of patterns across instances is an indicator of BERT's performance. Finally, we demonstrate that patterns account for far more model performance than previous attention-based and layer-based methods.

Submitted 30 November, 2021; v1 submitted 1 November, 2020; originally announced November 2020.

Comments: NeurIPS 2021

arXiv:2009.08507 [pdf, other]

Reconstructing Actions To Explain Deep Reinforcement Learning

Authors: Xuan Chen, Zifan Wang, Yucai Fan, Bonan Jin, Piotr Mardziel, Carlee Joe-Wong, Anupam Datta

Abstract: Feature attribution has been a foundational building block for explaining the input feature importance in supervised learning with Deep Neural Networks (DNNs), but faces new challenges when applied to deep Reinforcement Learning (RL). We propose a new approach to explaining deep RL actions by defining a class of \emph{action reconstruction} functions that mimic the behavior of a network in deep RL. This approach allows us to answer more complex explainability questions than direct application of DNN attribution methods, which we adapt to \emph{behavior-level attributions} in building our action reconstructions. It also allows us to define \emph{agreement}, a metric for quantitatively evaluating the explainability of our methods. Our experiments on a variety of Atari games suggest that perturbation-based attribution methods are significantly more suitable in reconstructing actions to explain the deep RL agent than alternative attribution methods, and show greater \emph{agreement} than existing explainability work utilizing attention. We further show that action reconstruction allows us to demonstrate how a deep agent learns to play the Pac-Man game.

Submitted 12 February, 2021; v1 submitted 17 September, 2020; originally announced September 2020.

arXiv:2006.07986 [pdf, other]

Fairness Under Feature Exemptions: Counterfactual and Observational Measures

Authors: Sanghamitra Dutta, Praveen Venkatesh, Piotr Mardziel, Anupam Datta, Pulkit Grover

Abstract: With the growing use of ML in highly consequential domains, quantifying disparity with respect to protected attributes, e.g., gender, race, etc., is important. While quantifying disparity is essential, sometimes the needs of an occupation may require the use of certain features that are critical in a way that any disparity that can be explained by them might need to be exempted. E.g., in hiring a software engineer for a safety-critical application, coding-skills may be weighed strongly, whereas name, zip code, or reference letters may be used only to the extent that they do not add disparity. In this work, we propose an information-theoretic decomposition of the total disparity (a quantification inspired from counterfactual fairness) into two components: a non-exempt component which quantifies the part that cannot be accounted for by the critical features, and an exempt component that quantifies the remaining disparity. This decomposition allows one to check if the disparity arose purely due to the critical features (inspired from the business necessity defense of disparate impact law) and also enables selective removal of the non-exempt component if desired. We arrive at this decomposition through canonical examples that lead to a set of desirable properties (axioms) that a measure of non-exempt disparity should satisfy. Our proposed measure satisfies all of them. Our quantification bridges ideas of causality, Simpson's paradox, and a body of work from information theory called Partial Information Decomposition.
We also obtain an impossibility result showing that no observational measure can satisfy all the desirable properties, leading us to relax our goals and examine observational measures that satisfy only some of them. We perform case studies to show how one can audit/train models while reducing non-exempt disparity.

Submitted 6 August, 2021; v1 submitted 14 June, 2020; originally announced June 2020.

Comments: Accepted at the IEEE Transactions on Information Theory (Shorter version at AAAI 2020 as an oral presentation)

arXiv:2006.06643 [pdf, other]

Smoothed Geometry for Robust Attribution

Authors: Zifan Wang, Haofan Wang, Shakul Ramkumar, Matt Fredrikson, Piotr Mardziel, Anupam Datta

Abstract: Feature attributions are a popular tool for explaining the behavior of Deep Neural Networks (DNNs), but have recently been shown to be vulnerable to attacks that produce divergent explanations for nearby inputs. This lack of robustness is especially problematic in high-stakes applications where adversarially-manipulated explanations could impair safety and trustworthiness. Building on a geometric understanding of these attacks presented in recent work, we identify Lipschitz continuity conditions on models' gradient that lead to robust gradient-based attributions, and observe that smoothness may also be related to the ability of an attack to transfer across multiple attribution methods. To mitigate these attacks in practice, we propose an inexpensive regularization method that promotes these conditions in DNNs, as well as a stochastic smoothing technique that does not require re-training. Our experiments on a range of image models demonstrate that both of these mitigations consistently improve attribution robustness, and confirm the role that smooth geometry plays in these attacks on real, large-scale models.

Submitted 22 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

arXiv:2005.01190 [pdf, other]

Influence Paths for Characterizing Subject-Verb Number Agreement in LSTM Language Models

Authors: Kaiji Lu, Piotr Mardziel, Klas Leino, Matt Fredrikson, Anupam Datta

Abstract: LSTM-based recurrent neural networks are the state-of-the-art for many natural language processing (NLP) tasks. Despite their performance, it is unclear whether, or how, LSTMs learn structural features of natural languages such as subject-verb number agreement in English. Lacking this understanding, the generality of LSTM performance on this task and their suitability for related tasks remain uncertain. Further, errors cannot be properly attributed to a lack of structural capability, training data omissions, or other exceptional faults. We introduce *influence paths*, a causal account of structural properties as carried by paths across gates and neurons of a recurrent neural network. The approach refines the notion of influence (the subject's grammatical number has influence on the grammatical number of the subsequent verb) into a set of gate or neuron-level paths. The set localizes and segments the concept (e.g., subject-verb agreement), its constituent elements (e.g., the subject), and related or interfering elements (e.g., attractors). We exemplify the methodology on a widely-studied multi-layer LSTM language model, demonstrating its accounting for subject-verb number agreement. The results offer both a finer and a more complete view of an LSTM's handling of this structural aspect of the English language than prior results based on diagnostic classifiers and ablation.

Submitted 3 May, 2020; originally announced May 2020.

Comments: ACL 2020

arXiv:2002.07985 [pdf, other]

Interpreting Interpretations: Organizing Attribution Methods by Criteria

Authors: Zifan Wang, Piotr Mardziel, Anupam Datta, Matt Fredrikson

Abstract: Motivated by distinct, though related, criteria, a growing number of attribution methods have been developed to interpret deep learning. While each relies on the interpretability of the concept of "importance" and our ability to visualize patterns, explanations produced by the methods often differ. As a result, input attribution for vision models fails to provide any level of human understanding of model behaviour. In this work we expand the foundations of human-understandable concepts with which attributions can be interpreted beyond "importance" and its visualization; we incorporate the logical concepts of necessity and sufficiency, and the concept of proportionality. We define metrics to represent these concepts as quantitative aspects of an attribution. This allows us to compare attributions produced by different methods and interpret them in novel ways: to what extent does this attribution (or this method) represent the necessity or sufficiency of the highlighted inputs, and to what extent is it proportional? We evaluate our measures on a collection of methods explaining convolutional neural networks (CNN) for image classification. We conclude that some attribution methods are more appropriate for interpretation in terms of necessity while others are in terms of sufficiency, while no method is always the most appropriate in terms of both.

Submitted 4 April, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

arXiv:1910.01279 [pdf, other]

Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks

Authors: Haofan Wang, Zifan Wang, Mengnan Du, Fan Yang, Zijian Zhang, Sirui Ding, Piotr Mardziel, Xia Hu

Abstract: Recently, increasing attention has been drawn to the internal mechanisms of convolutional neural networks, and the reason why the network makes specific decisions. In this paper, we develop a novel post-hoc visual explanation method called Score-CAM based on class activation mapping. Unlike previous class activation mapping based approaches, Score-CAM gets rid of the dependence on gradients by obtaining the weight of each activation map through its forward passing score on the target class; the final result is obtained by a linear combination of weights and activation maps. We demonstrate that Score-CAM achieves better visual performance and fairness for interpreting the decision making process. Our approach outperforms previous methods on both recognition and localization tasks, and it also passes the sanity check. We also indicate its application as a debugging tool. Official code has been released.

Submitted 13 April, 2020; v1 submitted 2 October, 2019; originally announced October 2019.

Comments: Accepted to CVPR 2020: Workshop on Fair, Data Efficient and Trusted Computer Vision

arXiv:1907.01679 [pdf, other]

Build It, Break It, Fix It: Contesting Secure Development

Authors: James Parker, Michael Hicks, Andrew Ruef, Michelle L. Mazurek, Dave Levin, Daniel Votipka, Piotr Mardziel, Kelsey R. Fulton

Abstract: Typical security contests focus on breaking or mitigating the impact of buggy systems. We present the Build-it, Break-it, Fix-it (BIBIFI) contest, which aims to assess the ability to securely build software, not just break it. In BIBIFI, teams build specified software with the goal of maximizing correctness, performance, and security. The latter is tested when teams attempt to break other teams' submissions. Winners are chosen from among the best builders and the best breakers. BIBIFI was designed to be open-ended; teams can use any language, tool, process, etc. that they like. As such, contest outcomes shed light on factors that correlate with successfully building secure software and breaking insecure software. We ran three contests involving a total of 156 teams and three different programming problems. Quantitative analysis from these contests found that the most efficient build-it submissions used C/C++, but submissions coded in a statically type-safe language were 11 times less likely to have a security flaw than C/C++ submissions. Break-it teams that were also successful build-it teams were significantly better at finding security bugs.

Submitted 2 July, 2019; originally announced July 2019.

Comments: 35pgs. Extension of arXiv:1606.01881 which was a conference paper previously published in CCS 2016. This is a journal version submitted to TOPS

arXiv:1807.11714 [pdf, other]

Gender Bias in Neural Natural Language Processing

Authors: Kaiji Lu, Piotr Mardziel, Fangjing Wu, Preetam Amancharla, Anupam Datta

Abstract: We examine whether neural natural language processing (NLP) systems reflect historical biases in training data. We define a general benchmark to quantify gender bias in a variety of neural NLP tasks. Our empirical evaluation with state-of-the-art neural coreference resolution and textbook RNN-based language models trained on benchmark datasets finds significant gender bias in how models view occupations. We then mitigate bias with CDA: a generic methodology for corpus augmentation via causal interventions that breaks associations between gendered and gender-neutral words. We empirically show that CDA effectively decreases gender bias while preserving accuracy. We also explore the space of mitigation strategies with CDA, a prior approach to word embedding debiasing (WED), and their compositions. We show that CDA outperforms WED, drastically so when word embeddings are trained. For pre-trained embeddings, the two methods can be effectively composed. We also find that as training proceeds on the original data set with gradient descent the gender bias grows as the loss reduces, indicating that the optimization encourages bias; CDA mitigates this behavior.

Submitted 30 May, 2019; v1 submitted 31 July, 2018; originally announced July 2018.

arXiv:1803.10815 [pdf, other]

Supervising Feature Influence

Authors: Shayak Sen, Piotr Mardziel, Anupam Datta, Matthew Fredrikson

Abstract: Causal influence measures for machine learnt classifiers shed light on the reasons behind classification, and aid in identifying influential input features and revealing their biases. However, such analyses involve evaluating the classifier using datapoints that may be atypical of its training distribution. Standard methods for training classifiers that minimize empirical risk do not constrain the behavior of the classifier on such datapoints. As a result, training to minimize empirical risk does not distinguish among classifiers that agree on predictions in the training distribution but have wildly different causal influences. We term this problem covariate shift in causal testing and formally characterize conditions under which it arises. As a solution to this problem, we propose a novel active learning algorithm that constrains the influence measures of the trained model. We prove that any two predictors whose errors are close on both the original training distribution and the distribution of atypical points are guaranteed to have causal influences that are also close. Further, we empirically demonstrate with synthetic labelers that our algorithm trains models that (i) have similar causal influences as the labeler's model, and (ii) generalize better to out-of-distribution points while (iii) retaining their accuracy on in-distribution points.

Submitted 7 April, 2018; v1 submitted 28 March, 2018; originally announced March 2018.

arXiv:1802.08927 [pdf, other]

doi 10.1007/978-3-319-89884-1_23

Evaluating Design Tradeoffs in Numeric Static Analysis for Java

Authors: Shiyi Wei, Piotr Mardziel, Andrew Ruef, Jeffrey S. Foster, Michael Hicks

Abstract: Numeric static analysis for Java has a broad range of potentially useful applications, including array bounds checking and resource usage estimation. However, designing a scalable numeric static analysis for real-world Java programs presents a multitude of design choices, each of which may interact with others. For example, an analysis could handle method calls via either a top-down or bottom-up interprocedural analysis. Moreover, this choice could interact with how we choose to represent aliasing in the heap and/or whether we use a relational numeric domain, e.g., convex polyhedra. In this paper, we present a family of abstract interpretation-based numeric static analyses for Java and systematically evaluate the impact of 162 analysis configurations on the DaCapo benchmark suite. Our experiment considered the precision and performance of the analyses for discharging array bounds checks. We found that top-down analysis is generally a better choice than bottom-up analysis, and that using access paths to describe heap objects is better than using summary objects corresponding to points-to analysis locations. Moreover, these two choices are the most significant, while choices about the numeric domain, representation of abstract objects, and context-sensitivity make much less difference to the precision/performance tradeoff.

Submitted 24 February, 2018; originally announced February 2018.

arXiv:1711.10816 [pdf, other]

Latent Factor Interpretations for Collaborative Filtering

Authors: Anupam Datta, Sophia Kovaleva, Piotr Mardziel, Shayak Sen

Abstract: Many machine learning systems utilize latent factors as internal representations for making predictions. Since these latent factors are largely uninterpreted, however, predictions made using them are opaque. Collaborative filtering via matrix factorization is a prime example of such an algorithm that uses uninterpreted latent features, and yet has seen widespread adoption for many recommendation tasks. We present Latent Factor Interpretation (LFI), a method for interpreting models by leveraging interpretations of latent factors in terms of human-understandable features. The interpretation of latent factors can then replace the uninterpreted latent factors, resulting in a new model that expresses predictions in terms of interpretable features. This new model can then be interpreted using recently developed model explanation techniques. In this paper we develop LFI for collaborative filtering based recommender systems. We illustrate the use of LFI interpretations on the MovieLens dataset, integrating auxiliary features from IMDB and DB tropes, and show that latent factors can be predicted with sufficient accuracy for replicating the predictions of the true model.

Submitted 9 April, 2018; v1 submitted 29 November, 2017; originally announced November 2017.

arXiv:1707.08120 [pdf, other]

Proxy Non-Discrimination in Data-Driven Systems

Authors: Anupam Datta, Matt Fredrikson, Gihyuk Ko, Piotr Mardziel, Shayak Sen

Abstract: Machine learnt systems inherit biases against protected classes, historically disparaged groups, from training data. Usually, these biases are not explicit; they rely on subtle correlations discovered by training algorithms, and are therefore difficult to detect. We formalize proxy discrimination in data-driven systems, a class of properties indicative of bias, as the presence of protected class correlates that have causal influence on the system's output. We evaluate an implementation on a corpus of social datasets, demonstrating how to validate systems against these properties and to repair violations where they occur.

Submitted 25 July, 2017; originally announced July 2017.

Comments: arXiv admin note: substantial text overlap with arXiv:1705.07807

arXiv:1705.07807 [pdf, other]

Use Privacy in Data-Driven Systems: Theory and Experiments with Machine Learnt Programs

Authors: Anupam Datta, Matthew Fredrikson, Gihyuk Ko, Piotr Mardziel, Shayak Sen

Abstract: This paper presents an approach to formalizing and enforcing a class of use privacy properties in data-driven systems. In contrast to prior work, we focus on use restrictions on proxies (i.e. strong predictors) of protected information types. Our definition relates proxy use to intermediate computations that occur in a program, and identifies two essential properties that characterize this behavior: 1) its result is strongly associated with the protected information type in question, and 2) it is likely to causally affect the final output of the program. For a specific instantiation of this definition, we present a program analysis technique that detects instances of proxy use in a model, and provides a witness that identifies which parts of the corresponding program exhibit the behavior. Recognizing that not all instances of proxy use of a protected information type are inappropriate, we make use of a normative judgment oracle that makes this inappropriateness determination for a given witness. Our repair algorithm uses the witness of an inappropriate proxy use to transform the model into one that provably does not exhibit proxy use, while avoiding changes that unduly affect classification accuracy. Using a corpus of social datasets, our evaluation shows that these algorithms are able to detect proxy use instances that would be difficult to find using existing techniques, and subsequently remove them while maintaining acceptable classification performance.

Submitted 7 September, 2017; v1 submitted 22 May, 2017; originally announced May 2017.

Comments: extended CCS 2017 camera-ready: several new discussions, and complexity results added to appendix

arXiv:1701.04174 [pdf, other]

Quantifying vulnerability of secret generation using hyper-distributions (extended version)

Authors: Mário S. Alvim, Piotr Mardziel, Michael Hicks

Abstract: Traditional approaches to Quantitative Information Flow (QIF) represent the adversary's prior knowledge of possible secret values as a single probability distribution. This representation may miss important structure. For instance, representing prior knowledge about passwords of a system's users in this way overlooks the fact that many users generate passwords using some strategy. Knowledge of such strategies can help the adversary in guessing a secret, so ignoring them may underestimate the secret's vulnerability. In this paper we explicitly model strategies as distributions on secrets, and generalize the representation of the adversary's prior knowledge from a distribution on secrets to an environment, which is a distribution on strategies (and, thus, a distribution on distributions on secrets, called a hyper-distribution). By applying information-theoretic techniques to environments we derive several meaningful generalizations of the traditional approach to QIF. In particular, we disentangle the vulnerability of a secret from the vulnerability of the strategies that generate secrets, and thereby distinguish security by aggregation--which relies on the uncertainty over strategies--from security by strategy--which relies on the intrinsic uncertainty within a strategy. We also demonstrate that, in a precise way, no further generalization of prior knowledge (e.g., by using distributions of even higher order) is needed to soundly quantify the vulnerability of the secret.

Submitted 21 January, 2017; v1 submitted 16 January, 2017; originally announced January 2017.

arXiv:1606.01881 [pdf, other]

doi 10.1145/2976749.2978382

Build It, Break It, Fix It: Contesting Secure Development

Authors: Andrew Ruef, Michael Hicks, James Parker, Dave Levin, Michelle L. Mazurek, Piotr Mardziel

Abstract: Typical security contests focus on breaking or mitigating the impact of buggy systems. We present the Build-it, Break-it, Fix-it (BIBIFI) contest, which aims to assess the ability to securely build software, not just break it. In BIBIFI, teams build specified software with the goal of maximizing correctness, performance, and security. The latter is tested when teams attempt to break other teams' submissions. Winners are chosen from among the best builders and the best breakers. BIBIFI was designed to be open-ended: teams can use any language, tool, process, etc. that they like. As such, contest outcomes shed light on factors that correlate with successfully building secure software and breaking insecure software. We ran three contests involving two different programming problems. Quantitative analysis from these contests found that the most efficient build-it submissions used C/C++, but submissions coded in a statically-typed language were less likely to have a security flaw; build-it teams with diverse programming-language knowledge also produced more secure code. Shorter programs correlated with better scores. Break-it teams that were also build-it teams were significantly better at finding security bugs.

Submitted 19 August, 2016; v1 submitted 6 June, 2016; originally announced June 2016.

arXiv:1505.02325 [pdf, other]

Picking vs. Guessing Secrets: A Game-Theoretic Analysis (Technical Report)

Authors: MHR Khouzani, Piotr Mardziel, Carlos Cid, Mudhakar Srivatsa

Abstract: Choosing a hard-to-guess secret is a prerequisite in many security applications. Whether it is a password for user authentication or a secret key for a cryptographic primitive, picking it requires the user to trade-off usability costs with resistance against an adversary: a simple password is easier to remember but is also easier to guess; likewise, a shorter cryptographic key may require fewer computational and storage resources but it is also easier to attack. A fundamental question is how one can optimally resolve this trade-off. A big challenge is the fact that an adversary can also utilize the knowledge of such usability vs. security trade-offs to strengthen its attack. In this paper, we propose a game-theoretic framework for analyzing the optimal trade-offs in the face of strategic adversaries. We consider two types of adversaries: those limited in their number of tries, and those that are ruled by the cost of making individual guesses. For each type, we derive the mutually-optimal decisions as Nash Equilibria, the strategically pessimistic decisions as maximin, and optimal commitments as Strong Stackelberg Equilibria of the game. We establish that when the adversaries are faced with a capped number of guesses, the user's optimal trade-off is a uniform randomization over a subset of the secret domain. On the other hand, when the attacker strategy is ruled by the cost of making individual guesses, Nash Equilibria may completely fail to provide the user with any level of security, signifying the crucial role of credible commitment for such cases.
We illustrate our results using numerical examples based on real-world samples and discuss some policy implications of our work.

Submitted 9 May, 2015; originally announced May 2015.

Comments: This manuscript is the extended version of our conference paper: "Picking vs. Guessing Secrets: A Game-Theoretic Analysis", in 28th IEEE Computer Security Foundations Symposium (CSF 2015), Verona, Italy, July 2015

MSC Class: 91A05; 91A80

ACM Class: K.6.5

Showing 1–18 of 18 results for author: Mardziel, P