Showing 1–18 of 18 results for author: Mardziel, P

  1. arXiv:2402.04489

    cs.LG cs.CR cs.CY stat.ME

    De-amplifying Bias from Differential Privacy in Language Model Fine-tuning

    Authors: Sanjari Srivastava, Piotr Mardziel, Zhikhun Zhang, Archana Ahlawat, Anupam Datta, John C Mitchell

    Abstract: Fairness and privacy are two important values machine learning (ML) practitioners often seek to operationalize in models. Fairness aims to reduce model bias for social/demographic sub-groups. Privacy via differential privacy (DP) mechanisms, on the other hand, limits the impact of any individual's training data on the resulting model. The trade-offs between privacy and fairness goals of trustworth…

    Submitted 6 February, 2024; originally announced February 2024.

  2. arXiv:2011.00740

    cs.CL

    Influence Patterns for Explaining Information Flow in BERT

    Authors: Kaiji Lu, Zifan Wang, Piotr Mardziel, Anupam Datta

    Abstract: While attention is all you need may be proving true, we do not know why: attention-based transformer models such as BERT are superior but how information flows from input tokens to output predictions is unclear. We introduce influence patterns, abstractions of sets of paths through a transformer model. Patterns quantify and localize the flow of information to paths passing through a sequence of m…

    Submitted 30 November, 2021; v1 submitted 1 November, 2020; originally announced November 2020.

    Comments: NeurIPS 2021

  3. arXiv:2009.08507

    cs.AI cs.LG

    Reconstructing Actions To Explain Deep Reinforcement Learning

    Authors: Xuan Chen, Zifan Wang, Yucai Fan, Bonan Jin, Piotr Mardziel, Carlee Joe-Wong, Anupam Datta

    Abstract: Feature attribution has been a foundational building block for explaining the input feature importance in supervised learning with Deep Neural Networks (DNNs), but faces new challenges when applied to deep Reinforcement Learning (RL). We propose a new approach to explaining deep RL actions by defining a class of action reconstruction functions that mimic the behavior of a network in deep RL. T…

    Submitted 12 February, 2021; v1 submitted 17 September, 2020; originally announced September 2020.

  4. arXiv:2006.07986

    cs.IT cs.CR cs.CY cs.LG stat.ML

    Fairness Under Feature Exemptions: Counterfactual and Observational Measures

    Authors: Sanghamitra Dutta, Praveen Venkatesh, Piotr Mardziel, Anupam Datta, Pulkit Grover

    Abstract: With the growing use of ML in highly consequential domains, quantifying disparity with respect to protected attributes, e.g., gender, race, etc., is important. While quantifying disparity is essential, sometimes the needs of an occupation may require the use of certain features that are critical in a way that any disparity that can be explained by them might need to be exempted. E.g., in hiring a…

    Submitted 6 August, 2021; v1 submitted 14 June, 2020; originally announced June 2020.

    Comments: Accepted at the IEEE Transactions on Information Theory (Shorter version at AAAI 2020 as an oral presentation)

  5. arXiv:2006.06643

    cs.LG stat.ML

    Smoothed Geometry for Robust Attribution

    Authors: Zifan Wang, Haofan Wang, Shakul Ramkumar, Matt Fredrikson, Piotr Mardziel, Anupam Datta

    Abstract: Feature attributions are a popular tool for explaining the behavior of Deep Neural Networks (DNNs), but have recently been shown to be vulnerable to attacks that produce divergent explanations for nearby inputs. This lack of robustness is especially problematic in high-stakes applications where adversarially-manipulated explanations could impair safety and trustworthiness. Building on a geometric…

    Submitted 22 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

  6. arXiv:2005.01190

    cs.CL

    Influence Paths for Characterizing Subject-Verb Number Agreement in LSTM Language Models

    Authors: Kaiji Lu, Piotr Mardziel, Klas Leino, Matt Fredrikson, Anupam Datta

    Abstract: LSTM-based recurrent neural networks are the state-of-the-art for many natural language processing (NLP) tasks. Despite their performance, it is unclear whether, or how, LSTMs learn structural features of natural languages such as subject-verb number agreement in English. Lacking this understanding, the generality of LSTM performance on this task and their suitability for related tasks remains unc…

    Submitted 3 May, 2020; originally announced May 2020.

    Comments: ACL 2020

  7. arXiv:2002.07985

    cs.AI

    Interpreting Interpretations: Organizing Attribution Methods by Criteria

    Authors: Zifan Wang, Piotr Mardziel, Anupam Datta, Matt Fredrikson

    Abstract: Motivated by distinct, though related, criteria, a growing number of attribution methods have been developed to interpret deep learning. While each relies on the interpretability of the concept of "importance" and our ability to visualize patterns, explanations produced by the methods often differ. As a result, input attribution for vision models fails to provide any level of human understanding of…

    Submitted 4 April, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

  8. arXiv:1910.01279

    cs.CV

    Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks

    Authors: Haofan Wang, Zifan Wang, Mengnan Du, Fan Yang, Zijian Zhang, Sirui Ding, Piotr Mardziel, Xia Hu

    Abstract: Recently, increasing attention has been drawn to the internal mechanisms of convolutional neural networks, and the reason why the network makes specific decisions. In this paper, we develop a novel post-hoc visual explanation method called Score-CAM based on class activation mapping. Unlike previous class activation mapping based approaches, Score-CAM gets rid of the dependence on gradients by obt…

    Submitted 13 April, 2020; v1 submitted 2 October, 2019; originally announced October 2019.

    Comments: Accepted to CVPR 2020: Workshop on Fair, Data Efficient and Trusted Computer Vision

  9. arXiv:1907.01679

    cs.CR

    Build It, Break It, Fix It: Contesting Secure Development

    Authors: James Parker, Michael Hicks, Andrew Ruef, Michelle L. Mazurek, Dave Levin, Daniel Votipka, Piotr Mardziel, Kelsey R. Fulton

    Abstract: Typical security contests focus on breaking or mitigating the impact of buggy systems. We present the Build-it, Break-it, Fix-it (BIBIFI) contest, which aims to assess the ability to securely build software, not just break it. In BIBIFI, teams build specified software with the goal of maximizing correctness, performance, and security. The latter is tested when teams attempt to break other teams' s…

    Submitted 2 July, 2019; originally announced July 2019.

    Comments: 35 pages. Extension of arXiv:1606.01881, a conference paper previously published in CCS 2016. This is a journal version submitted to TOPS

  10. arXiv:1807.11714

    cs.CL

    Gender Bias in Neural Natural Language Processing

    Authors: Kaiji Lu, Piotr Mardziel, Fangjing Wu, Preetam Amancharla, Anupam Datta

    Abstract: We examine whether neural natural language processing (NLP) systems reflect historical biases in training data. We define a general benchmark to quantify gender bias in a variety of neural NLP tasks. Our empirical evaluation with state-of-the-art neural coreference resolution and textbook RNN-based language models trained on benchmark datasets finds significant gender bias in how models view occup…

    Submitted 30 May, 2019; v1 submitted 31 July, 2018; originally announced July 2018.

  11. arXiv:1803.10815

    cs.LG stat.ML

    Supervising Feature Influence

    Authors: Shayak Sen, Piotr Mardziel, Anupam Datta, Matthew Fredrikson

    Abstract: Causal influence measures for machine learnt classifiers shed light on the reasons behind classification, and aid in identifying influential input features and revealing their biases. However, such analyses involve evaluating the classifier using datapoints that may be atypical of its training distribution. Standard methods for training classifiers that minimize empirical risk do not constrain the…

    Submitted 7 April, 2018; v1 submitted 28 March, 2018; originally announced March 2018.

  12. Evaluating Design Tradeoffs in Numeric Static Analysis for Java

    Authors: Shiyi Wei, Piotr Mardziel, Andrew Ruef, Jeffrey S. Foster, Michael Hicks

    Abstract: Numeric static analysis for Java has a broad range of potentially useful applications, including array bounds checking and resource usage estimation. However, designing a scalable numeric static analysis for real-world Java programs presents a multitude of design choices, each of which may interact with others. For example, an analysis could handle method calls via either a top-down or bottom-up i…

    Submitted 24 February, 2018; originally announced February 2018.

  13. arXiv:1711.10816

    cs.IR

    Latent Factor Interpretations for Collaborative Filtering

    Authors: Anupam Datta, Sophia Kovaleva, Piotr Mardziel, Shayak Sen

    Abstract: Many machine learning systems utilize latent factors as internal representations for making predictions. Since these latent factors are largely uninterpreted, however, predictions made using them are opaque. Collaborative filtering via matrix factorization is a prime example of such an algorithm that uses uninterpreted latent features, and yet has seen widespread adoption for many recommendation t…

    Submitted 9 April, 2018; v1 submitted 29 November, 2017; originally announced November 2017.

  14. arXiv:1707.08120

    cs.CY cs.LG

    Proxy Non-Discrimination in Data-Driven Systems

    Authors: Anupam Datta, Matt Fredrikson, Gihyuk Ko, Piotr Mardziel, Shayak Sen

    Abstract: Machine learnt systems inherit biases against protected classes, historically disparaged groups, from training data. Usually, these biases are not explicit; they rely on subtle correlations discovered by training algorithms, and are therefore difficult to detect. We formalize proxy discrimination in data-driven systems, a class of properties indicative of bias, as the presence of protected class c…

    Submitted 25 July, 2017; originally announced July 2017.

    Comments: arXiv admin note: substantial text overlap with arXiv:1705.07807

  15. arXiv:1705.07807

    cs.CR cs.LG

    Use Privacy in Data-Driven Systems: Theory and Experiments with Machine Learnt Programs

    Authors: Anupam Datta, Matthew Fredrikson, Gihyuk Ko, Piotr Mardziel, Shayak Sen

    Abstract: This paper presents an approach to formalizing and enforcing a class of use privacy properties in data-driven systems. In contrast to prior work, we focus on use restrictions on proxies (i.e. strong predictors) of protected information types. Our definition relates proxy use to intermediate computations that occur in a program, and identifies two essential properties that characterize this behavior:…

    Submitted 7 September, 2017; v1 submitted 22 May, 2017; originally announced May 2017.

    Comments: extended CCS 2017 camera-ready: several new discussions, and complexity results added to appendix

  16. arXiv:1701.04174

    cs.CR

    Quantifying vulnerability of secret generation using hyper-distributions (extended version)

    Authors: Mário S. Alvim, Piotr Mardziel, Michael Hicks

    Abstract: Traditional approaches to Quantitative Information Flow (QIF) represent the adversary's prior knowledge of possible secret values as a single probability distribution. This representation may miss important structure. For instance, representing prior knowledge about passwords of a system's users in this way overlooks the fact that many users generate passwords using some strategy. Knowledge of suc…

    Submitted 21 January, 2017; v1 submitted 16 January, 2017; originally announced January 2017.

  17. Build It, Break It, Fix It: Contesting Secure Development

    Authors: Andrew Ruef, Michael Hicks, James Parker, Dave Levin, Michelle L. Mazurek, Piotr Mardziel

    Abstract: Typical security contests focus on breaking or mitigating the impact of buggy systems. We present the Build-it, Break-it, Fix-it (BIBIFI) contest, which aims to assess the ability to securely build software, not just break it. In BIBIFI, teams build specified software with the goal of maximizing correctness, performance, and security. The latter is tested when teams attempt to break other teams' submissions…

    Submitted 19 August, 2016; v1 submitted 6 June, 2016; originally announced June 2016.

  18. arXiv:1505.02325

    cs.GT cs.CR math.OC

    Picking vs. Guessing Secrets: A Game-Theoretic Analysis (Technical Report)

    Authors: MHR Khouzani, Piotr Mardziel, Carlos Cid, Mudhakar Srivatsa

    Abstract: Choosing a hard-to-guess secret is a prerequisite in many security applications. Whether it is a password for user authentication or a secret key for a cryptographic primitive, picking it requires the user to trade off usability costs with resistance against an adversary: a simple password is easier to remember but is also easier to guess; likewise, a shorter cryptographic key may require fewer co…

    Submitted 9 May, 2015; originally announced May 2015.

    Comments: This manuscript is the extended version of our conference paper: "Picking vs. Guessing Secrets: A Game-Theoretic Analysis", in 28th IEEE Computer Security Foundations Symposium (CSF 2015), Verona, Italy, July 2015

    MSC Class: 91A05; 91A80

    ACM Class: K.6.5