Showing 1–50 of 64 results for author: Vaughan, J

Searching in archive cs.
  1. arXiv:2408.01300  [pdf]

    stat.ML cs.LG

    Assessing Robustness of Machine Learning Models using Covariate Perturbations

    Authors: Arun Prakash R, Anwesha Bhattacharyya, Joel Vaughan, Vijayan N. Nair

    Abstract: As machine learning models become increasingly prevalent in critical decision-making models and systems in fields like finance, healthcare, etc., ensuring their robustness against adversarial attacks and changes in the input data is paramount, especially in cases where models potentially overfit. This paper proposes a comprehensive framework for assessing the robustness of machine learning models…

    Submitted 2 August, 2024; originally announced August 2024.

    Comments: 31 pages, 11 figures, 14 tables

  2. arXiv:2408.01057  [pdf, other]

    cs.HC

    Supporting Industry Computing Researchers in Assessing, Articulating, and Addressing the Potential Negative Societal Impact of Their Work

    Authors: Wesley Hanwen Deng, Solon Barocas, Jennifer Wortman Vaughan

    Abstract: Recent years have witnessed increasing calls for computing researchers to grapple with the societal impacts of their work. Tools such as impact assessments have gained prominence as a method to uncover potential impacts, and a number of publication venues now encourage authors to include an impact statement in their submissions. Despite this push, little is known about the way researchers assess,…

    Submitted 12 August, 2024; v1 submitted 2 August, 2024; originally announced August 2024.

  3. arXiv:2407.11225  [pdf, other]

    cs.HC cs.CY

    (De)Noise: Moderating the Inconsistency Between Human Decision-Makers

    Authors: Nina Grgić-Hlača, Junaid Ali, Krishna P. Gummadi, Jennifer Wortman Vaughan

    Abstract: Prior research in psychology has found that people's decisions are often inconsistent. An individual's decisions vary across time, and decisions vary even more across people. Inconsistencies have been identified not only in subjective matters, like matters of taste, but also in settings one might expect to be more objective, such as sentencing, job performance evaluations, or real estate appraisal…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: To appear in CSCW 2024

  4. "I'm Not Sure, But...": Examining the Impact of Large Language Models' Uncertainty Expression on User Reliance and Trust

    Authors: Sunnie S. Y. Kim, Q. Vera Liao, Mihaela Vorvoreanu, Stephanie Ballard, Jennifer Wortman Vaughan

    Abstract: Widely deployed large language models (LLMs) can produce convincing yet incorrect outputs, potentially misleading users who may rely on them as if they were correct. To reduce such overreliance, there have been calls for LLMs to communicate their uncertainty to end users. However, there has been little empirical work examining how users perceive and act upon LLMs' expressions of uncertainty. We ex…

    Submitted 15 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

    Comments: Accepted to FAccT 2024. This version includes the appendix

  5. arXiv:2402.18777  [pdf]

    eess.IV cs.CV

    GDCNet: Calibrationless geometric distortion correction of echo planar imaging data using deep learning

    Authors: Marina Manso Jimeno, Keren Bachi, George Gardner, Yasmin L. Hurd, John Thomas Vaughan Jr., Sairam Geethanath

    Abstract: Functional magnetic resonance imaging techniques benefit from echo-planar imaging's fast image acquisition but are susceptible to inhomogeneities in the main magnetic field, resulting in geometric distortion and signal loss artifacts in the images. Traditional methods leverage a field map or voxel displacement map for distortion correction. However, voxel displacement map estimation requires addit…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 30 pages, 9 figures, 3 tables

  6. arXiv:2402.08749  [pdf]

    cs.CV cs.LG

    Automated detection of motion artifacts in brain MR images using deep learning and explainable artificial intelligence

    Authors: Marina Manso Jimeno, Keerthi Sravan Ravi, Maggie Fung, John Thomas Vaughan, Jr., Sairam Geethanath

    Abstract: Quality assessment, including inspecting the images for artifacts, is a critical step during MRI data acquisition to ensure data quality and downstream analysis or interpretation success. This study demonstrates a deep learning model to detect rigid motion in T1-weighted brain images. We leveraged a 2D CNN for three-class classification and tested it on publicly available retrospective and prospec…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: 25 pages, 9 figures, 1 table. Submitted to NMR in Biomedicine

  7. arXiv:2401.09051  [pdf, other]

    cs.HC

    Canvil: Designerly Adaptation for LLM-Powered User Experiences

    Authors: K. J. Kevin Feng, Q. Vera Liao, Ziang Xiao, Jennifer Wortman Vaughan, Amy X. Zhang, David W. McDonald

    Abstract: Advancements in large language models (LLMs) are poised to spark a proliferation of LLM-powered user experiences. In product teams, designers are often tasked with crafting user experiences that align with user needs. To involve designers and leverage their user-centered perspectives to create effective and responsible LLM-powered products, we introduce the practice of designerly adaptation for en…

    Submitted 17 January, 2024; originally announced January 2024.

  8. arXiv:2312.06153  [pdf, other]

    cs.LG cs.AI cs.HC

    Open Datasheets: Machine-readable Documentation for Open Datasets and Responsible AI Assessments

    Authors: Anthony Cintron Roman, Jennifer Wortman Vaughan, Valerie See, Steph Ballard, Jehu Torres, Caleb Robinson, Juan M. Lavista Ferres

    Abstract: This paper introduces a no-code, machine-readable documentation framework for open datasets, with a focus on responsible AI (RAI) considerations. The framework aims to improve comprehensibility, and usability of open datasets, facilitating easier discovery and use, better understanding of content and context, and evaluation of dataset quality and accuracy. The proposed framework is designed to str…

    Submitted 27 March, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

  9. arXiv:2306.03262  [pdf, other]

    cs.LG cs.DL

    Has the Machine Learning Review Process Become More Arbitrary as the Field Has Grown? The NeurIPS 2021 Consistency Experiment

    Authors: Alina Beygelzimer, Yann N. Dauphin, Percy Liang, Jennifer Wortman Vaughan

    Abstract: We present the NeurIPS 2021 consistency experiment, a larger-scale variant of the 2014 NeurIPS experiment in which 10% of conference submissions were reviewed by two independent committees to quantify the randomness in the review process. We observe that the two committees disagree on their accept/reject recommendations for 23% of the papers and that, consistent with the results from 2014, approxi…

    Submitted 5 June, 2023; originally announced June 2023.

  10. arXiv:2306.01941  [pdf, other]

    cs.HC cs.AI cs.CY

    AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap

    Authors: Q. Vera Liao, Jennifer Wortman Vaughan

    Abstract: The rise of powerful large language models (LLMs) brings about tremendous opportunities for innovation but also looming risks for individuals and society at large. We have reached a pivotal moment for ensuring that LLMs and LLM-infused applications are developed and deployed responsibly. However, a central pillar of responsible AI -- transparency -- is largely missing from the current discourse ar…

    Submitted 7 August, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

  11. arXiv:2304.05524  [pdf, other]

    cs.LG cs.CL

    Understanding Causality with Large Language Models: Feasibility and Opportunities

    Authors: Cheng Zhang, Stefan Bauer, Paul Bennett, Jiangfeng Gao, Wenbo Gong, Agrin Hilmkil, Joel Jennings, Chao Ma, Tom Minka, Nick Pawlowski, James Vaughan

    Abstract: We assess the ability of large language models (LLMs) to answer causal questions by analyzing their strengths and weaknesses against three types of causal question. We believe that current LLMs can answer causal questions with existing causal knowledge as combined domain experts. However, they are not yet able to provide satisfactory answers for discovering new knowledge or for high-stakes decisio…

    Submitted 11 April, 2023; originally announced April 2023.

  12. arXiv:2302.14165  [pdf, other]

    cs.LG cs.AI cs.HC

    GAM Coach: Towards Interactive and User-centered Algorithmic Recourse

    Authors: Zijie J. Wang, Jennifer Wortman Vaughan, Rich Caruana, Duen Horng Chau

    Abstract: Machine learning (ML) recourse techniques are increasingly used in high-stakes domains, providing end users with actions to alter ML predictions, but they assume ML developers understand what input variables can be changed. However, a recourse plan's actionability is subjective and unlikely to match developers' expectations completely. We present GAM Coach, a novel open-source system that adapts i…

    Submitted 28 February, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: Accepted to CHI 2023. 20 pages, 12 figures. For a demo video, see https://youtu.be/ubacP34H9XE. For a live demo, visit https://poloclub.github.io/gam-coach/

  13. arXiv:2302.10395  [pdf, other]

    cs.HC cs.AI

    Designerly Understanding: Information Needs for Model Transparency to Support Design Ideation for AI-Powered User Experience

    Authors: Q. Vera Liao, Hariharan Subramonyam, Jennifer Wang, Jennifer Wortman Vaughan

    Abstract: Despite the widespread use of artificial intelligence (AI), designing user experiences (UX) for AI-powered systems remains challenging. UX designers face hurdles understanding AI technologies, such as pre-trained language models, as design materials. This limits their ability to ideate and make decisions about whether, where, and how to use AI. To address this problem, we bridge the literature on…

    Submitted 20 February, 2023; originally announced February 2023.

    Comments: Accepted at ACM CHI Conference on Human Factors in Computing Systems (CHI 2023)

  14. arXiv:2302.07248  [pdf, other]

    cs.HC cs.AI

    Generation Probabilities Are Not Enough: Exploring the Effectiveness of Uncertainty Highlighting in AI-Powered Code Completions

    Authors: Helena Vasconcelos, Gagan Bansal, Adam Fourney, Q. Vera Liao, Jennifer Wortman Vaughan

    Abstract: Large-scale generative models enabled the development of AI-powered code completion tools to assist programmers in writing code. However, much like other AI-powered tools, AI-powered code completions are not always accurate, potentially introducing bugs or even security vulnerabilities into code if not properly detected and corrected by a human programmer. One technique that has been proposed and…

    Submitted 14 February, 2023; originally announced February 2023.

  15. arXiv:2301.07255  [pdf, other]

    cs.HC cs.AI

    Understanding the Role of Human Intuition on Reliance in Human-AI Decision-Making with Explanations

    Authors: Valerie Chen, Q. Vera Liao, Jennifer Wortman Vaughan, Gagan Bansal

    Abstract: AI explanations are often mentioned as a way to improve human-AI decision-making, but empirical studies have not found consistent evidence of explanations' effectiveness and, on the contrary, suggest that they can increase overreliance when the AI system is wrong. While many factors may affect reliance on AI support, one important factor is how decision-makers reconcile their own intuition -- beli…

    Submitted 14 June, 2023; v1 submitted 17 January, 2023; originally announced January 2023.

    Comments: To appear in CSCW 2023

  16. arXiv:2212.01303  [pdf, other]

    cs.RO cs.AI

    Selecting Mechanical Parameters of a Monopode Jumping System with Reinforcement Learning

    Authors: Andrew Albright, Joshua Vaughan

    Abstract: Legged systems have many advantages when compared to their wheeled counterparts. For example, they can more easily navigate extreme, uneven terrain. However, there are disadvantages as well, particularly the difficulty seen in modeling the nonlinearities of the system. Research has shown that using flexible components within legged locomotive systems improves performance measures such as efficienc…

    Submitted 2 December, 2022; originally announced December 2022.

  17. arXiv:2211.12966  [pdf, other]

    cs.LG cs.DB cs.DL

    How do Authors' Perceptions of their Papers Compare with Co-authors' Perceptions and Peer-review Decisions?

    Authors: Charvi Rastogi, Ivan Stelmakh, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, Jennifer Wortman Vaughan, Zhenyu Xue, Hal Daumé III, Emma Pierson, Nihar B. Shah

    Abstract: How do author perceptions match up to the outcomes of the peer-review process and perceptions of others? In a top-tier computer science conference (NeurIPS 2021) with more than 23,000 submitting authors and 9,000 submitted papers, we survey the authors on three questions: (i) their predicted probability of acceptance for each of their papers, (ii) their perceived ranking of their own papers based…

    Submitted 22 November, 2022; originally announced November 2022.

  18. arXiv:2211.08536  [pdf]

    cs.LG

    Behavior of Hyper-Parameters for Selected Machine Learning Algorithms: An Empirical Investigation

    Authors: Anwesha Bhattacharyya, Joel Vaughan, Vijayan N. Nair

    Abstract: Hyper-parameters (HPs) are an important part of machine learning (ML) model development and can greatly influence performance. This paper studies their behavior for three algorithms: Extreme Gradient Boosting (XGB), Random Forest (RF), and Feedforward Neural Network (FFNN) with structured data. Our empirical investigation examines the qualitative behavior of model performance as the HPs vary, quan…

    Submitted 15 November, 2022; originally announced November 2022.

  19. arXiv:2208.02896  [pdf, other]

    cs.LG cs.AI

    Interpretable Distribution Shift Detection using Optimal Transport

    Authors: Neha Hulkund, Nicolo Fusi, Jennifer Wortman Vaughan, David Alvarez-Melis

    Abstract: We propose a method to identify and characterize distribution shifts in classification datasets based on optimal transport. It allows the user to identify the extent to which each class is affected by the shift, and retrieves corresponding pairs of samples to provide insights on its nature. We illustrate its use on synthetic and natural shift examples. While the results we present are preliminary,…

    Submitted 4 August, 2022; originally announced August 2022.

    Comments: Presented at ICML 2022 DataPerf Workshop

  20. arXiv:2206.15465  [pdf, other]

    cs.LG cs.AI cs.HC

    Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values

    Authors: Zijie J. Wang, Alex Kale, Harsha Nori, Peter Stella, Mark E. Nunnally, Duen Horng Chau, Mihaela Vorvoreanu, Jennifer Wortman Vaughan, Rich Caruana

    Abstract: Machine learning (ML) interpretability techniques can reveal undesirable patterns in data that models exploit to make predictions--potentially causing harms once deployed. However, how to take action to address these patterns is not always clear. In a collaboration between ML and human-computer interaction researchers, physicians, and data scientists, we develop GAM Changer, the first interactive…

    Submitted 30 June, 2022; originally announced June 2022.

    Comments: Accepted at KDD 2022. 11 pages, 19 figures. For a demo video, see https://youtu.be/D6whtfInqTc. For a live demo, visit https://interpret.ml/gam-changer

  21. arXiv:2206.12353  [pdf]

    stat.ML cs.LG

    Quantifying Inherent Randomness in Machine Learning Algorithms

    Authors: Soham Raste, Rahul Singh, Joel Vaughan, Vijayan N. Nair

    Abstract: Most machine learning (ML) algorithms have several stochastic elements, and their performances are affected by these sources of randomness. This paper uses an empirical study to systematically examine the effects of two sources: randomness in model training and randomness in the partitioning of a dataset into training and test subsets. We quantify and compare the magnitude of the variation in pred…

    Submitted 24 June, 2022; originally announced June 2022.

    Comments: 14 pages, 4 Figures, 5 tables

  22. arXiv:2206.02923  [pdf, ps, other]

    cs.HC cs.AI

    Understanding Machine Learning Practitioners' Data Documentation Perceptions, Needs, Challenges, and Desiderata

    Authors: Amy K. Heger, Liz B. Marquis, Mihaela Vorvoreanu, Hanna Wallach, Jennifer Wortman Vaughan

    Abstract: Data is central to the development and evaluation of machine learning (ML) models. However, the use of problematic or inappropriate datasets can result in harms when the resulting models are deployed. To encourage responsible AI practice through more deliberate reflection on datasets and transparency around the processes by which they are created, researchers and practitioners have begun to advoca…

    Submitted 24 August, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: Camera-ready preprint of paper accepted to CSCW 2022

  23. arXiv:2205.12723  [pdf]

    cs.LG

    Interpretable Feature Engineering for Time Series Predictors using Attention Networks

    Authors: Tianjie Wang, Jie Chen, Joel Vaughan, Vijayan N. Nair

    Abstract: Regression problems with time-series predictors are common in banking and many other areas of application. In this paper, we use multi-head attention networks to develop interpretable features and use them to achieve good predictive performance. The customized attention layer explicitly uses multiplicative interactions and builds feature-engineering heads that capture temporal dynamics in a parsim…

    Submitted 23 May, 2022; originally announced May 2022.

  24. arXiv:2205.08363  [pdf, other]

    cs.LG cs.AI cs.CY

    REAL ML: Recognizing, Exploring, and Articulating Limitations of Machine Learning Research

    Authors: Jessie J. Smith, Saleema Amershi, Solon Barocas, Hanna Wallach, Jennifer Wortman Vaughan

    Abstract: Transparency around limitations can improve the scientific rigor of research, help ensure appropriate interpretation of research findings, and make research claims more credible. Despite these benefits, the machine learning (ML) research community lacks well-developed norms around disclosing and discussing limitations. To address this gap, we conduct an iterative design process with 30 ML and ML-a…

    Submitted 5 May, 2022; originally announced May 2022.

    Comments: This work appears in the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22)

  25. arXiv:2112.05675  [pdf, ps, other]

    cs.AI cs.CY cs.HC

    Assessing the Fairness of AI Systems: AI Practitioners' Processes, Challenges, and Needs for Support

    Authors: Michael Madaio, Lisa Egede, Hariharan Subramonyam, Jennifer Wortman Vaughan, Hanna Wallach

    Abstract: Various tools and practices have been developed to support practitioners in identifying, assessing, and mitigating fairness-related harms caused by AI systems. However, prior research has highlighted gaps between the intended design of these tools and practices and their use within particular contexts, including gaps caused by the role that organizational factors play in shaping fairness work. In…

    Submitted 10 February, 2022; v1 submitted 10 December, 2021; originally announced December 2021.

    Comments: Camera-ready preprint of paper accepted to the CSCW conference

  26. arXiv:2112.03245  [pdf, other]

    cs.LG cs.AI cs.HC

    GAM Changer: Editing Generalized Additive Models with Interactive Visualization

    Authors: Zijie J. Wang, Alex Kale, Harsha Nori, Peter Stella, Mark Nunnally, Duen Horng Chau, Mihaela Vorvoreanu, Jennifer Wortman Vaughan, Rich Caruana

    Abstract: Recent strides in interpretable machine learning (ML) research reveal that models exploit undesirable patterns in the data to make predictions, which potentially causes harms in deployment. However, it is unclear how we can fix these models. We present our ongoing work, GAM Changer, an open-source interactive system to help data scientists and domain experts easily and responsibly edit their Gener…

    Submitted 6 December, 2021; originally announced December 2021.

    Comments: 7 pages, 15 figures, accepted to the Research2Clinics workshop at NeurIPS 2021. For a demo video, see https://youtu.be/2gVSoPoSeJ8. For a live demo, visit https://interpret.ml/gam-changer/

  27. arXiv:2111.08922  [pdf, other]

    cs.LG math.OC stat.ML

    Traversing the Local Polytopes of ReLU Neural Networks: A Unified Approach for Network Verification

    Authors: Shaojie Xu, Joel Vaughan, Jie Chen, Aijun Zhang, Agus Sudjianto

    Abstract: Although neural networks (NNs) with ReLU activation functions have found success in a wide range of applications, their adoption in risk-sensitive settings has been limited by the concerns on robustness and interpretability. Previous works to examine robustness and to improve interpretability partially exploited the piecewise linear function form of ReLU NNs. In this paper, we explore the unique t…

    Submitted 9 January, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

  28. arXiv:2109.04244  [pdf, other]

    stat.ML cs.LG

    Supervised Linear Dimension-Reduction Methods: Review, Extensions, and Comparisons

    Authors: Shaojie Xu, Joel Vaughan, Jie Chen, Agus Sudjianto, Vijayan Nair

    Abstract: Principal component analysis (PCA) is a well-known linear dimension-reduction method that has been widely used in data analysis and modeling. It is an unsupervised learning technique that identifies a suitable linear subspace for the input variable that contains maximal variation and preserves as much information as possible. PCA has also been used in prediction models where the original, high-dim…

    Submitted 9 September, 2021; originally announced September 2021.

  29. arXiv:2104.13299  [pdf, other]

    cs.AI cs.LG

    From Human Explanation to Model Interpretability: A Framework Based on Weight of Evidence

    Authors: David Alvarez-Melis, Harmanpreet Kaur, Hal Daumé III, Hanna Wallach, Jennifer Wortman Vaughan

    Abstract: We take inspiration from the study of human explanation to inform the design and evaluation of interpretability methods in machine learning. First, we survey the literature on human explanation in philosophy, cognitive science, and the social sciences, and propose a list of design principles for machine-generated explanations that are meaningful to humans. Using the concept of weight of evidence f…

    Submitted 20 September, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

    Comments: HCOMP 2021

  30. arXiv:2104.01052  [pdf, other]

    cs.DL cs.LO

    An Evaluation of the Archive of Formal Proofs

    Authors: Carlin MacKenzie, Jacques Fleuriot, James Vaughan

    Abstract: The Archive of Formal Proofs (AFP) is an online repository of formal proofs for the Isabelle proof assistant. It serves as a central location for publishing, discovering, and viewing libraries of proofs. We conducted an online survey in November 2020 to assess the suitability of the website. In this report, we present and discuss the results, which showed that long-term users of the website are ge…

    Submitted 2 April, 2021; originally announced April 2021.

  31. arXiv:2103.06076  [pdf, other]

    cs.CY cs.AI cs.LG

    Designing Disaggregated Evaluations of AI Systems: Choices, Considerations, and Tradeoffs

    Authors: Solon Barocas, Anhong Guo, Ece Kamar, Jacquelyn Krones, Meredith Ringel Morris, Jennifer Wortman Vaughan, Duncan Wadsworth, Hanna Wallach

    Abstract: Disaggregated evaluations of AI systems, in which system performance is assessed and reported separately for different groups of people, are conceptually simple. However, their design involves a variety of choices. Some of these choices influence the results that will be obtained, and thus the conclusions that can be drawn; others influence the impacts -- both beneficial and harmful -- that a disa…

    Submitted 1 December, 2021; v1 submitted 10 March, 2021; originally announced March 2021.

  32. arXiv:2101.01816  [pdf, ps, other]

    cs.GT

    Incentive-Compatible Forecasting Competitions

    Authors: Jens Witkowski, Rupert Freeman, Jennifer Wortman Vaughan, David M. Pennock, Andreas Krause

    Abstract: We initiate the study of incentive-compatible forecasting competitions in which multiple forecasters make predictions about one or more events and compete for a single prize. We have two objectives: (1) to incentivize forecasters to report truthfully and (2) to award the prize to the most accurate forecaster. Proper scoring rules incentivize truthful reporting if all forecasters are paid according…

    Submitted 7 September, 2021; v1 submitted 5 January, 2021; originally announced January 2021.

    Comments: 38 pages. Relative to the previous version Appendix A and Theorem 5 are new. This version additionally contains some expanded exposition

  33. arXiv:2010.13187  [pdf, other]

    stat.ML cs.CV cs.LG

    Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling

    Authors: Akash Srivastava, Yamini Bansal, Yukun Ding, Cole Lincoln Hurwitz, Kai Xu, Bernhard Egger, Prasanna Sattigeri, Joshua B. Tenenbaum, Phuong Le, Arun Prakash R, Nengfeng Zhou, Joel Vaughan, Yaquan Wang, Anwesha Bhattacharyya, Kristjan Greenewald, David D. Cox, Dan Gutfreund

    Abstract: Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors. This approach introduces a trade-off between disentangled representation learning and reconstruction quality since the model does not have enough capacity to learn correlated latent variables that capture…

    Submitted 3 April, 2024; v1 submitted 25 October, 2020; originally announced October 2020.

  34. arXiv:2008.04059  [pdf]

    q-fin.GN cs.LG stat.ML

    Supervised Machine Learning Techniques: An Overview with Applications to Banking

    Authors: Linwei Hu, Jie Chen, Joel Vaughan, Hanyu Yang, Kelly Wang, Agus Sudjianto, Vijayan N. Nair

    Abstract: This article provides an overview of Supervised Machine Learning (SML) with a focus on applications to banking. The SML techniques covered include Bagging (Random Forest or RF), Boosting (Gradient Boosting Machine or GBM) and Neural Networks (NNs). We begin with an introduction to ML tasks and techniques. This is followed by a description of: i) tree-based ensemble algorithms including Bagging wit…

    Submitted 28 July, 2020; originally announced August 2020.

  35. arXiv:2007.03661  [pdf]

    cs.CY cs.HC

    Mathematical Foundations for Social Computing

    Authors: Yiling Chen, Arpita Ghosh, Michael Kearns, Tim Roughgarden, Jennifer Wortman Vaughan

    Abstract: Social computing encompasses the mechanisms through which people interact with computational systems: crowdsourcing systems, ranking and recommendation systems, online prediction markets, citizen science projects, and collaboratively edited wikis, to name a few. These systems share the common feature that humans are active participants, making choices that determine the input to, and therefore the…

    Submitted 7 July, 2020; originally announced July 2020.

    Comments: A Computing Community Consortium (CCC) workshop report, 15 pages

    Report number: ccc2014report_5

  36. arXiv:2005.10624  [pdf, ps, other]

    cs.LG stat.ML

    Greedy Algorithm almost Dominates in Smoothed Contextual Bandits

    Authors: Manish Raghavan, Aleksandrs Slivkins, Jennifer Wortman Vaughan, Zhiwei Steven Wu

    Abstract: Online learning algorithms, widely used to power search and content optimization on the web, must balance exploration and exploitation, potentially sacrificing the experience of current users in order to gain information that will lead to better decisions in the future. While necessary in the worst case, explicit exploration has a number of disadvantages compared to the greedy algorithm that alway…

    Submitted 27 December, 2021; v1 submitted 19 May, 2020; originally announced May 2020.

    Comments: Results in this paper, without any proofs, have been announced in an extended abstract (Raghavan et al., 2018a), and fleshed out in the technical report (Raghavan et al., 2018b [arXiv:1806.00543]). This manuscript covers a subset of results from Raghavan et al. (2018a,b), focusing on the greedy algorithm, and is streamlined accordingly

  37. arXiv:2004.02353  [pdf]

    stat.ML cs.AI cs.LG

    Adaptive Explainable Neural Networks (AxNNs)

    Authors: Jie Chen, Joel Vaughan, Vijayan N. Nair, Agus Sudjianto

    Abstract: While machine learning techniques have been successfully applied in several fields, the black-box nature of the models presents challenges for interpreting and explaining the results. We develop a new framework called Adaptive Explainable Neural Networks (AxNN) for achieving the dual goals of good predictive performance and model interpretability. For predictive performance, we build a structured…

    Submitted 2 June, 2020; v1 submitted 5 April, 2020; originally announced April 2020.

  38. arXiv:2002.08837  [pdf, other]

    cs.LG cs.GT stat.ML

    No-Regret and Incentive-Compatible Online Learning

    Authors: Rupert Freeman, David M. Pennock, Chara Podimata, Jennifer Wortman Vaughan

    Abstract: We study online learning settings in which experts act strategically to maximize their influence on the learning algorithm's predictions by potentially misreporting their beliefs about a sequence of binary events. Our goal is twofold. First, we want the learning algorithm to be no-regret with respect to the best fixed expert in hindsight. Second, we want incentive compatibility, a guarantee that e…

    Submitted 30 June, 2020; v1 submitted 20 February, 2020; originally announced February 2020.

    Comments: Appears in ICML2020

  39. arXiv:2001.05551  [pdf, other]

    q-bio.QM cs.CV eess.IV

    Substituting Gadolinium in Brain MRI Using DeepContrast

    Authors: Haoran Sun, Xueqing Liu, Xinyang Feng, Chen Liu, Nanyan Zhu, Sabrina J. Gjerswold-Selleck, Hong-Jian Wei, Pavan S. Upadhyayula, Angeliki Mela, Cheng-Chia Wu, Peter D. Canoll, Andrew F. Laine, J. Thomas Vaughan, Scott A. Small, Jia Guo

    Abstract: Cerebral blood volume (CBV) is a hemodynamic correlate of oxygen metabolism and reflects brain activity and function. High-resolution CBV maps can be generated using the steady-state gadolinium-enhanced MRI technique. Such a technique requires an intravenous injection of exogenous gadolinium based contrast agent (GBCA) and recent studies suggest that the GBCA can accumulate in the brain after freq…

    Submitted 15 January, 2020; originally announced January 2020.

    Journal ref: The IEEE International Symposium on Biomedical Imaging (ISBI) 2020

  40. arXiv:1910.13503  [pdf, other]

    cs.LG cs.AI stat.ML

    Weight of Evidence as a Basis for Human-Oriented Explanations

    Authors: David Alvarez-Melis, Hal Daumé III, Jennifer Wortman Vaughan, Hanna Wallach

    Abstract: Interpretability is an elusive but highly sought-after characteristic of modern machine learning methods. Recent work has focused on interpretability via $\textit{explanations}$, which justify individual model predictions. In this work, we take a step towards reconciling machine explanations with those that humans produce and prefer by taking inspiration from the study of explanation in philosophy…

    Submitted 29 October, 2019; originally announced October 2019.

    Comments: Human-Centric Machine Learning (HCML) Workshop @ NeurIPS 2019

  41. arXiv:1907.02227  [pdf, other]

    cs.CY cs.AI cs.HC

    Toward Fairness in AI for People with Disabilities: A Research Roadmap

    Authors: Anhong Guo, Ece Kamar, Jennifer Wortman Vaughan, Hanna Wallach, Meredith Ringel Morris

    Abstract: AI technologies have the potential to dramatically impact the lives of people with disabilities (PWD). Indeed, improving the lives of PWD is a motivator for many state-of-the-art AI systems, such as automated speech recognition tools that can caption videos for people who are deaf and hard of hearing, or language prediction algorithms that can augment communication for people with speech or cognit…

    Submitted 2 August, 2019; v1 submitted 4 July, 2019; originally announced July 2019.

    Comments: ACM ASSETS 2019 Workshop on AI Fairness for People with Disabilities

  42. Truthful Aggregation of Budget Proposals

    Authors: Rupert Freeman, David M. Pennock, Dominik Peters, Jennifer Wortman Vaughan

    Abstract: We consider a participatory budgeting problem in which each voter submits a proposal for how to divide a single divisible resource (such as money or time) among several possible alternatives (such as public projects or activities) and these proposals must be aggregated into a single aggregate division. Under $\ell_1$ preferences -- for which a voter's disutility is given by the $\ell_1$ distance b…

    Submitted 21 January, 2022; v1 submitted 1 May, 2019; originally announced May 2019.

    Comments: 28 pages, final journal version

    Journal ref: Journal of Economic Theory, Volume 193, April 2021, 105234

  43. arXiv:1812.05239  [pdf, other]

    cs.HC cs.CY cs.LG cs.SE

    Improving fairness in machine learning systems: What do industry practitioners need?

    Authors: Kenneth Holstein, Jennifer Wortman Vaughan, Hal Daumé III, Miro Dudík, Hanna Wallach

    Abstract: The potential for machine learning (ML) systems to amplify social inequities and unfairness is receiving increasing popular and academic attention. A surge of recent work has focused on the development of algorithmic tools to assess and mitigate such unfairness. If these tools are to have a positive impact on industry practice, however, it is crucial that their design be informed by an understandi…

    Submitted 7 January, 2019; v1 submitted 12 December, 2018; originally announced December 2018.

    Comments: To appear in the 2019 ACM CHI Conference on Human Factors in Computing Systems (CHI 2019)

  44. arXiv:1808.08646  [pdf, other]

    cs.LG cs.GT stat.ML

    The Disparate Effects of Strategic Manipulation

    Authors: Lily Hu, Nicole Immorlica, Jennifer Wortman Vaughan

    Abstract: When consequential decisions are informed by algorithmic input, individuals may feel compelled to alter their behavior in order to gain a system's approval. Models of agent responsiveness, termed "strategic manipulation," analyze the interaction between a learner and agents in a world where all agents are equally able to manipulate their features in an attempt to "trick" a published classifier. In…

    Submitted 10 May, 2019; v1 submitted 26 August, 2018; originally announced August 2018.

    Comments: 29 pages, 4 figures

  45. arXiv:1808.07216  [pdf]

    stat.ML cs.LG

    Model Interpretation: A Unified Derivative-based Framework for Nonparametric Regression and Supervised Machine Learning

    Authors: Xiaoyu Liu, Jie Chen, Joel Vaughan, Vijayan Nair, Agus Sudjianto

    Abstract: Interpreting a nonparametric regression model with many predictors is known to be a challenging problem. There has been renewed interest in this topic due to the extensive use of machine learning algorithms and the difficulty in understanding and explaining their input-output relationships. This paper develops a unified framework using a derivative-based approach for existing tools in the literatu…

    Submitted 8 September, 2018; v1 submitted 22 August, 2018; originally announced August 2018.

  46. arXiv:1806.05740  [pdf, other]

    cs.CY cs.AI cs.CL

    Using Search Queries to Understand Health Information Needs in Africa

    Authors: Rediet Abebe, Shawndra Hill, Jennifer Wortman Vaughan, Peter M. Small, H. Andrew Schwartz

    Abstract: The lack of comprehensive, high-quality health data in developing nations creates a roadblock for combating the impacts of disease. One key challenge is understanding the health information needs of people in these nations. Without understanding people's everyday needs, concerns, and misconceptions, health organizations and policymakers lack the ability to effectively target education and programm…

    Submitted 17 April, 2019; v1 submitted 14 June, 2018; originally announced June 2018.

    Comments: Extended version of an ICWSM 2019 paper

  47. arXiv:1806.01933  [pdf, other]

    stat.ML cs.LG

    Explainable Neural Networks based on Additive Index Models

    Authors: Joel Vaughan, Agus Sudjianto, Erind Brahimi, Jie Chen, Vijayan N. Nair

    Abstract: Machine Learning algorithms are increasingly being used in recent years due to their flexibility in model fitting and increased predictive performance. However, the complexity of the models makes them hard for the data analyst to interpret the results and explain them without additional tools. This has led to much research in developing various approaches to understand the model behavior. In this…

    Submitted 5 June, 2018; originally announced June 2018.

    Comments: 10 pages, 8 figures

  48. arXiv:1806.00543  [pdf, ps, other]

    cs.LG cs.CY stat.ML

    The Externalities of Exploration and How Data Diversity Helps Exploitation

    Authors: Manish Raghavan, Aleksandrs Slivkins, Jennifer Wortman Vaughan, Zhiwei Steven Wu

    Abstract: Online learning algorithms, widely used to power search and content optimization on the web, must balance exploration and exploitation, potentially sacrificing the experience of current users for information that will lead to better decisions in the future. Recently, concerns have been raised about whether the process of exploration could be viewed as unfair, placing too much burden on certain ind…

    Submitted 2 July, 2018; v1 submitted 1 June, 2018; originally announced June 2018.

  49. arXiv:1803.09010  [pdf, other]

    cs.DB cs.AI cs.LG

    Datasheets for Datasets

    Authors: Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, Kate Crawford

    Abstract: The machine learning community currently has no standardized process for documenting datasets, which can lead to severe consequences in high-stakes domains. To address this gap, we propose datasheets for datasets. In the electronics industry, every component, no matter how simple or complex, is accompanied with a datasheet that describes its operating characteristics, test results, recommended use…

    Submitted 1 December, 2021; v1 submitted 23 March, 2018; originally announced March 2018.

    Comments: Published in CACM in December, 2021

  50. arXiv:1802.07810  [pdf, other]

    cs.AI cs.CY

    Manipulating and Measuring Model Interpretability

    Authors: Forough Poursabzi-Sangdeh, Daniel G. Goldstein, Jake M. Hofman, Jennifer Wortman Vaughan, Hanna Wallach

    Abstract: With machine learning models being increasingly used to aid decision making even in high-stakes domains, there has been a growing interest in developing interpretable models. Although many supposedly interpretable models have been proposed, there have been relatively few experimental studies investigating whether these models achieve their intended effects, such as making people more closely follo…

    Submitted 15 August, 2021; v1 submitted 21 February, 2018; originally announced February 2018.

    ACM Class: I.2