Search | arXiv e-print repository

Understanding the Relationship between Prompts and Response Uncertainty in Large Language Models

Authors: Ze Yu Zhang, Arun Verma, Finale Doshi-Velez, Bryan Kian Hsiang Low

Abstract: Large language models (LLMs) are widely used in decision-making, but their reliability, especially in critical tasks like healthcare, is not well-established. Therefore, understanding how LLMs reason and make decisions is crucial for their safe deployment. This paper investigates how the uncertainty of responses generated by LLMs relates to the information provided in the input prompt. Leveraging… ▽ More Large language models (LLMs) are widely used in decision-making, but their reliability, especially in critical tasks like healthcare, is not well-established. Therefore, understanding how LLMs reason and make decisions is crucial for their safe deployment. This paper investigates how the uncertainty of responses generated by LLMs relates to the information provided in the input prompt. Leveraging the insight that LLMs learn to infer latent concepts during pretraining, we propose a prompt-response concept model that explains how LLMs generate responses and helps understand the relationship between prompts and response uncertainty. We show that the uncertainty decreases as the prompt's informativeness increases, similar to epistemic uncertainty. Our detailed experimental results on real datasets validate our proposed model. △ Less

Submitted 21 August, 2024; v1 submitted 20 July, 2024; originally announced July 2024.

Comments: 27 pages, 13 figures

arXiv:2406.08636 [pdf, other]

Towards Integrating Personal Knowledge into Test-Time Predictions

Authors: Isaac Lage, Sonali Parbhoo, Finale Doshi-Velez

Abstract: Machine learning (ML) models can make decisions based on large amounts of data, but they can be missing personal knowledge available to human users about whom predictions are made. For example, a model trained to predict psychiatric outcomes may know nothing about a patient's social support system, and social support may look different for different patients. In this work, we introduce the problem… ▽ More Machine learning (ML) models can make decisions based on large amounts of data, but they can be missing personal knowledge available to human users about whom predictions are made. For example, a model trained to predict psychiatric outcomes may know nothing about a patient's social support system, and social support may look different for different patients. In this work, we introduce the problem of human feature integration, which provides a way to incorporate important personal-knowledge from users without domain expertise into ML predictions. We characterize this problem through illustrative user stories and comparisons to existing approaches; we formally describe this problem in a way that paves the ground for future technical solutions; and we provide a proof-of-concept study of a simple version of a solution to this problem in a semi-realistic setting. △ Less

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.00116 [pdf, other]

A Sim2Real Approach for Identifying Task-Relevant Properties in Interpretable Machine Learning

Authors: Eura Nofshin, Esther Brown, Brian Lim, Weiwei Pan, Finale Doshi-Velez

Abstract: Existing user studies suggest that different tasks may require explanations with different properties. However, user studies are expensive. In this paper, we introduce a generalizable, cost-effective method for identifying task-relevant explanation properties in silico, which can guide the design of more expensive user studies. We use our approach to identify relevant proxies for three example tas… ▽ More Existing user studies suggest that different tasks may require explanations with different properties. However, user studies are expensive. In this paper, we introduce a generalizable, cost-effective method for identifying task-relevant explanation properties in silico, which can guide the design of more expensive user studies. We use our approach to identify relevant proxies for three example tasks and validate our simulation with real user studies. △ Less

Submitted 31 May, 2024; originally announced June 2024.

arXiv:2404.14660 [pdf, ps, other]

AI Procurement Checklists: Revisiting Implementation in the Age of AI Governance

Authors: Tom Zick, Mason Kortz, David Eaves, Finale Doshi-Velez

Abstract: Public sector use of AI has been quietly on the rise for the past decade, but only recently have efforts to regulate it entered the cultural zeitgeist. While simple to articulate, promoting ethical and effective roll outs of AI systems in government is a notoriously elusive task. On the one hand there are hard-to-address pitfalls associated with AI-based tools, including concerns about bias toward… ▽ More Public sector use of AI has been quietly on the rise for the past decade, but only recently have efforts to regulate it entered the cultural zeitgeist. While simple to articulate, promoting ethical and effective roll outs of AI systems in government is a notoriously elusive task. On the one hand there are hard-to-address pitfalls associated with AI-based tools, including concerns about bias towards marginalized communities, safety, and gameability. On the other, there is pressure not to make it too difficult to adopt AI, especially in the public sector which typically has fewer resources than the private sector$\unicode{x2014}$conserving scarce government resources is often the draw of using AI-based tools in the first place. These tensions create a real risk that procedures built to ensure marginalized groups are not hurt by government use of AI will, in practice, be performative and ineffective. To inform the latest wave of regulatory efforts in the United States, we look to jurisdictions with mature regulations around government AI use. We report on lessons learned by officials in Brazil, Singapore and Canada, who have collectively implemented risk categories, disclosure requirements and assessments into the way they procure AI tools. In particular, we investigate two implemented checklists: the Canadian Directive on Automated Decision-Making (CDADM) and the World Economic Forum's AI Procurement in a Box (WEF). We detail three key pitfalls around expertise, risk frameworks and transparency, that can decrease the efficacy of regulations aimed at government AI use and suggest avenues for improvement. △ Less

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2403.08941 [pdf, other]

Towards Model-Agnostic Posterior Approximation for Fast and Accurate Variational Autoencoders

Authors: Yaniv Yacoby, Weiwei Pan, Finale Doshi-Velez

Abstract: Inference for Variational Autoencoders (VAEs) consists of learning two models: (1) a generative model, which transforms a simple distribution over a latent space into the distribution over observed data, and (2) an inference model, which approximates the posterior of the latent codes given data. The two components are learned jointly via a lower bound to the generative model's log marginal likelih… ▽ More Inference for Variational Autoencoders (VAEs) consists of learning two models: (1) a generative model, which transforms a simple distribution over a latent space into the distribution over observed data, and (2) an inference model, which approximates the posterior of the latent codes given data. The two components are learned jointly via a lower bound to the generative model's log marginal likelihood. In early phases of joint training, the inference model poorly approximates the latent code posteriors. Recent work showed that this leads optimization to get stuck in local optima, negatively impacting the learned generative model. As such, recent work suggests ensuring a high-quality inference model via iterative training: maximizing the objective function relative to the inference model before every update to the generative model. Unfortunately, iterative training is inefficient, requiring heuristic criteria for reverting from iterative to joint training for speed. Here, we suggest an inference method that trains the generative and inference models independently. It approximates the posterior of the true model a priori; fixing this posterior approximation, we then maximize the lower bound relative to only the generative model. By conventional wisdom, this approach should rely on the true prior and likelihood of the true model to approximate its posterior (which are unknown). However, we show that we can compute a deterministic, model-agnostic posterior approximation (MAPA) of the true model's posterior. We then use MAPA to develop a proof-of-concept inference method. We present preliminary results on low-dimensional synthetic data that (1) MAPA captures the trend of the true posterior, and (2) our MAPA-based inference performs better density estimation with less computation than baselines. Lastly, we present a roadmap for scaling the MAPA-based inference method to high-dimensional data. △ Less

Submitted 12 June, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

Comments: Accepted at the Workshop at the 6th Symposium on Advances in Approximate Bayesian Inference (AABI) 2024

arXiv:2402.17003 [pdf, other]

Monitoring Fidelity of Online Reinforcement Learning Algorithms in Clinical Trials

Authors: Anna L. Trella, Kelly W. Zhang, Inbal Nahum-Shani, Vivek Shetty, Iris Yan, Finale Doshi-Velez, Susan A. Murphy

Abstract: Online reinforcement learning (RL) algorithms offer great potential for personalizing treatment for participants in clinical trials. However, deploying an online, autonomous algorithm in the high-stakes healthcare setting makes quality control and data quality especially difficult to achieve. This paper proposes algorithm fidelity as a critical requirement for deploying online RL algorithms in cli… ▽ More Online reinforcement learning (RL) algorithms offer great potential for personalizing treatment for participants in clinical trials. However, deploying an online, autonomous algorithm in the high-stakes healthcare setting makes quality control and data quality especially difficult to achieve. This paper proposes algorithm fidelity as a critical requirement for deploying online RL algorithms in clinical trials. It emphasizes the responsibility of the algorithm to (1) safeguard participants and (2) preserve the scientific utility of the data for post-trial analyses. We also present a framework for pre-deployment planning and real-time monitoring to help algorithm developers and clinical researchers ensure algorithm fidelity. To illustrate our framework's practical application, we present real-world examples from the Oralytics clinical trial. Since Spring 2023, this trial successfully deployed an autonomous, online RL algorithm to personalize behavioral interventions for participants at risk for dental disease. △ Less

Submitted 12 August, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

arXiv:2402.12737 [pdf, other]

Guarantee Regions for Local Explanations

Authors: Marton Havasi, Sonali Parbhoo, Finale Doshi-Velez

Abstract: Interpretability methods that utilise local surrogate models (e.g. LIME) are very good at describing the behaviour of the predictive model at a point of interest, but they are not guaranteed to extrapolate to the local region surrounding the point. However, overfitting to the local curvature of the predictive model and malicious tampering can significantly limit extrapolation. We propose an anchor… ▽ More Interpretability methods that utilise local surrogate models (e.g. LIME) are very good at describing the behaviour of the predictive model at a point of interest, but they are not guaranteed to extrapolate to the local region surrounding the point. However, overfitting to the local curvature of the predictive model and malicious tampering can significantly limit extrapolation. We propose an anchor-based algorithm for identifying regions in which local explanations are guaranteed to be correct by explicitly describing those intervals along which the input features can be trusted. Our method produces an interpretable feature-aligned box where the prediction of the local surrogate model is guaranteed to match the predictive model. We demonstrate that our algorithm can be used to find explanations with larger guarantee regions that better cover the data manifold compared to existing baselines. We also show how our method can identify misleading local explanations with significantly poorer guarantee regions. △ Less

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.03110 [pdf, other]

Non-Stationary Latent Auto-Regressive Bandits

Authors: Anna L. Trella, Walter Dempsey, Finale Doshi-Velez, Susan A. Murphy

Abstract: We consider the stochastic multi-armed bandit problem with non-stationary rewards. We present a novel formulation of non-stationarity in the environment where changes in the mean reward of the arms over time are due to some unknown, latent, auto-regressive (AR) state of order $k$. We call this new environment the latent AR bandit. Different forms of the latent AR bandit appear in many real-world s… ▽ More We consider the stochastic multi-armed bandit problem with non-stationary rewards. We present a novel formulation of non-stationarity in the environment where changes in the mean reward of the arms over time are due to some unknown, latent, auto-regressive (AR) state of order $k$. We call this new environment the latent AR bandit. Different forms of the latent AR bandit appear in many real-world settings, especially in emerging scientific fields such as behavioral health or education where there are few mechanistic models of the environment. If the AR order $k$ is known, we propose an algorithm that achieves $\tilde{O}(k\sqrt{T})$ regret in this setting. Empirically, our algorithm outperforms standard UCB across multiple non-stationary environments, even if $k$ is mis-specified. △ Less

Submitted 12 August, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

arXiv:2401.16419 [pdf, other]

Semi-parametric Expert Bayesian Network Learning with Gaussian Processes and Horseshoe Priors

Authors: Yidou Weng, Finale Doshi-Velez

Abstract: This paper proposes a model learning Semi-parametric relationships in an Expert Bayesian Network (SEBN) with linear parameter and structure constraints. We use Gaussian Processes and a Horseshoe prior to introduce minimal nonlinear components. To prioritize modifying the expert graph over adding new edges, we optimize differential Horseshoe scales. In real-world datasets with unknown truth, we gen… ▽ More This paper proposes a model learning Semi-parametric relationships in an Expert Bayesian Network (SEBN) with linear parameter and structure constraints. We use Gaussian Processes and a Horseshoe prior to introduce minimal nonlinear components. To prioritize modifying the expert graph over adding new edges, we optimize differential Horseshoe scales. In real-world datasets with unknown truth, we generate diverse graphs to accommodate user input, addressing identifiability issues and enhancing interpretability. Evaluation on synthetic and UCI Liver Disorders datasets, using metrics like structural Hamming Distance and test likelihood, demonstrates our models outperform state-of-the-art semi-parametric Bayesian Network model. △ Less

Submitted 29 January, 2024; originally announced January 2024.

Comments: 8 pages, 4 figures, AAAI-2024 workshops

arXiv:2401.14923 [pdf, other]

Reinforcement Learning Interventions on Boundedly Rational Human Agents in Frictionful Tasks

Authors: Eura Nofshin, Siddharth Swaroop, Weiwei Pan, Susan Murphy, Finale Doshi-Velez

Abstract: Many important behavior changes are frictionful; they require individuals to expend effort over a long period with little immediate gratification. Here, an artificial intelligence (AI) agent can provide personalized interventions to help individuals stick to their goals. In these settings, the AI agent must personalize rapidly (before the individual disengages) and interpretably, to help us unders… ▽ More Many important behavior changes are frictionful; they require individuals to expend effort over a long period with little immediate gratification. Here, an artificial intelligence (AI) agent can provide personalized interventions to help individuals stick to their goals. In these settings, the AI agent must personalize rapidly (before the individual disengages) and interpretably, to help us understand the behavioral interventions. In this paper, we introduce Behavior Model Reinforcement Learning (BMRL), a framework in which an AI agent intervenes on the parameters of a Markov Decision Process (MDP) belonging to a boundedly rational human agent. Our formulation of the human decision-maker as a planning agent allows us to attribute undesirable human policies (ones that do not lead to the goal) to their maladapted MDP parameters, such as an extremely low discount factor. Furthermore, we propose a class of tractable human models that captures fundamental behaviors in frictionful tasks. Introducing a notion of MDP equivalence specific to BMRL, we theoretically and empirically show that AI planning with our human models can lead to helpful policies on a wide range of more complex, ground-truth humans. △ Less

Submitted 26 January, 2024; originally announced January 2024.

Comments: In AAMAS 2024

arXiv:2312.09983 [pdf, other]

Toward Computationally Efficient Inverse Reinforcement Learning via Reward Shaping

Authors: Lauren H. Cooke, Harvey Klyne, Edwin Zhang, Cassidy Laidlaw, Milind Tambe, Finale Doshi-Velez

Abstract: Inverse reinforcement learning (IRL) is computationally challenging, with common approaches requiring the solution of multiple reinforcement learning (RL) sub-problems. This work motivates the use of potential-based reward shaping to reduce the computational burden of each RL sub-problem. This work serves as a proof-of-concept and we hope will inspire future developments towards computationally ef… ▽ More Inverse reinforcement learning (IRL) is computationally challenging, with common approaches requiring the solution of multiple reinforcement learning (RL) sub-problems. This work motivates the use of potential-based reward shaping to reduce the computational burden of each RL sub-problem. This work serves as a proof-of-concept and we hope will inspire future developments towards computationally efficient IRL. △ Less

Submitted 18 December, 2023; v1 submitted 15 December, 2023; originally announced December 2023.

arXiv:2309.11443 [pdf, other]

Signature Activation: A Sparse Signal View for Holistic Saliency

Authors: Jose Roberto Tello Ayala, Akl C. Fahed, Weiwei Pan, Eugene V. Pomerantsev, Patrick T. Ellinor, Anthony Philippakis, Finale Doshi-Velez

Abstract: The adoption of machine learning in healthcare calls for model transparency and explainability. In this work, we introduce Signature Activation, a saliency method that generates holistic and class-agnostic explanations for Convolutional Neural Network (CNN) outputs. Our method exploits the fact that certain kinds of medical images, such as angiograms, have clear foreground and background objects.… ▽ More The adoption of machine learning in healthcare calls for model transparency and explainability. In this work, we introduce Signature Activation, a saliency method that generates holistic and class-agnostic explanations for Convolutional Neural Network (CNN) outputs. Our method exploits the fact that certain kinds of medical images, such as angiograms, have clear foreground and background objects. We give theoretical explanation to justify our methods. We show the potential use of our method in clinical settings through evaluating its efficacy for aiding the detection of lesions in coronary angiograms. △ Less

Submitted 20 September, 2023; originally announced September 2023.

arXiv:2309.00254 [pdf, other]

Why do universal adversarial attacks work on large language models?: Geometry might be the answer

Authors: Varshini Subhash, Anna Bialas, Weiwei Pan, Finale Doshi-Velez

Abstract: Transformer based large language models with emergent capabilities are becoming increasingly ubiquitous in society. However, the task of understanding and interpreting their internal workings, in the context of adversarial attacks, remains largely unsolved. Gradient-based universal adversarial attacks have been shown to be highly effective on large language models and potentially dangerous due to… ▽ More Transformer based large language models with emergent capabilities are becoming increasingly ubiquitous in society. However, the task of understanding and interpreting their internal workings, in the context of adversarial attacks, remains largely unsolved. Gradient-based universal adversarial attacks have been shown to be highly effective on large language models and potentially dangerous due to their input-agnostic nature. This work presents a novel geometric perspective explaining universal adversarial attacks on large language models. By attacking the 117M parameter GPT-2 model, we find evidence indicating that universal adversarial triggers could be embedding vectors which merely approximate the semantic information in their adversarial training region. This hypothesis is supported by white-box model analysis comprising dimensionality reduction and similarity measurement of hidden representations. We believe this new geometric perspective on the underlying mechanism driving universal attacks could help us gain deeper insight into the internal workings and failure modes of LLMs, thus enabling their mitigation. △ Less

Submitted 1 September, 2023; originally announced September 2023.

Comments: 2nd AdvML Frontiers Workshop at 40th International Conference on Machine Learning, Honolulu, Hawaii, USA, 2023

arXiv:2308.05075 [pdf, other]

Bayesian Inverse Transition Learning for Offline Settings

Authors: Leo Benac, Sonali Parbhoo, Finale Doshi-Velez

Abstract: Offline Reinforcement learning is commonly used for sequential decision-making in domains such as healthcare and education, where the rewards are known and the transition dynamics $T$ must be estimated on the basis of batch data. A key challenge for all tasks is how to learn a reliable estimate of the transition dynamics $T$ that produce near-optimal policies that are safe enough so that they neve… ▽ More Offline Reinforcement learning is commonly used for sequential decision-making in domains such as healthcare and education, where the rewards are known and the transition dynamics $T$ must be estimated on the basis of batch data. A key challenge for all tasks is how to learn a reliable estimate of the transition dynamics $T$ that produce near-optimal policies that are safe enough so that they never take actions that are far away from the best action with respect to their value functions and informative enough so that they communicate the uncertainties they have. Using data from an expert, we propose a new constraint-based approach that captures our desiderata for reliably learning a posterior distribution of the transition dynamics $T$ that is free from gradients. Our results demonstrate that by using our constraints, we learn a high-performing policy, while considerably reducing the policy's variance over different datasets. We also explain how combining uncertainty estimation with these constraints can help us infer a partial ranking of actions that produce higher returns, and helps us infer safer and more informative policies for planning. △ Less

Submitted 9 August, 2023; originally announced August 2023.

Comments: 8 pages, 1 plots, 2 tables

arXiv:2308.01420 [pdf, other]

SAP-sLDA: An Interpretable Interface for Exploring Unstructured Text

Authors: Charumathi Badrinath, Weiwei Pan, Finale Doshi-Velez

Abstract: A common way to explore text corpora is through low-dimensional projections of the documents, where one hopes that thematically similar documents will be clustered together in the projected space. However, popular algorithms for dimensionality reduction of text corpora, like Latent Dirichlet Allocation (LDA), often produce projections that do not capture human notions of document similarity. We pr… ▽ More A common way to explore text corpora is through low-dimensional projections of the documents, where one hopes that thematically similar documents will be clustered together in the projected space. However, popular algorithms for dimensionality reduction of text corpora, like Latent Dirichlet Allocation (LDA), often produce projections that do not capture human notions of document similarity. We propose a semi-supervised human-in-the-loop LDA-based method for learning topics that preserve semantically meaningful relationships between documents in low-dimensional projections. On synthetic corpora, our method yields more interpretable projections than baseline methods with only a fraction of labels provided. On a real corpus, we obtain qualitatively similar results. △ Less

Submitted 28 July, 2023; originally announced August 2023.

arXiv:2307.08169 [pdf, other]

Discovering User Types: Mapping User Traits by Task-Specific Behaviors in Reinforcement Learning

Authors: L. L. Ankile, B. S. Ham, K. Mao, E. Shin, S. Swaroop, F. Doshi-Velez, W. Pan

Abstract: When assisting human users in reinforcement learning (RL), we can represent users as RL agents and study key parameters, called \emph{user traits}, to inform intervention design. We study the relationship between user behaviors (policy classes) and user traits. Given an environment, we introduce an intuitive tool for studying the breakdown of "user types": broad sets of traits that result in the s… ▽ More When assisting human users in reinforcement learning (RL), we can represent users as RL agents and study key parameters, called \emph{user traits}, to inform intervention design. We study the relationship between user behaviors (policy classes) and user traits. Given an environment, we introduce an intuitive tool for studying the breakdown of "user types": broad sets of traits that result in the same behavior. We show that seemingly different real-world environments admit the same set of user types and formalize this observation as an equivalence relation defined on environments. By transferring intervention design between environments within the same equivalence class, we can help rapidly personalize interventions. △ Less

Submitted 16 July, 2023; originally announced July 2023.

arXiv:2307.06541 [pdf, other]

On the Effective Horizon of Inverse Reinforcement Learning

Authors: Yiqing Xu, Finale Doshi-Velez, David Hsu

Abstract: Inverse reinforcement learning (IRL) algorithms often rely on (forward) reinforcement learning or planning over a given time horizon to compute an approximately optimal policy for a hypothesized reward function and then match this policy with expert demonstrations. The time horizon plays a critical role in determining both the accuracy of reward estimate and the computational efficiency of IRL alg… ▽ More Inverse reinforcement learning (IRL) algorithms often rely on (forward) reinforcement learning or planning over a given time horizon to compute an approximately optimal policy for a hypothesized reward function and then match this policy with expert demonstrations. The time horizon plays a critical role in determining both the accuracy of reward estimate and the computational efficiency of IRL algorithms. Interestingly, an effective time horizon shorter than the ground-truth value often produces better results faster. This work formally analyzes this phenomenon and provides an explanation: the time horizon controls the complexity of an induced policy class and mitigates overfitting with limited data. This analysis leads to a principled choice of the effective horizon for IRL. It also prompts us to reexamine the classic IRL formulation: it is more natural to learn jointly the reward and the effective horizon together rather than the reward alone with a given horizon. Our experimental results confirm the theoretical analysis. △ Less

Submitted 12 July, 2023; originally announced July 2023.

Comments: 9 pages, under review

arXiv:2306.12609 [pdf, other]

Towards Regulatable AI Systems: Technical Gaps and Policy Opportunities

Authors: Xudong Shen, Hannah Brown, Jiashu Tao, Martin Strobel, Yao Tong, Akshay Narayan, Harold Soh, Finale Doshi-Velez

Abstract: There is increasing attention being given to how to regulate AI systems. As governing bodies grapple with what values to encapsulate into regulation, we consider the technical half of the question: To what extent can AI experts vet an AI system for adherence to regulatory requirements? We investigate this question through the lens of two public sector procurement checklists, identifying what we ca… ▽ More There is increasing attention being given to how to regulate AI systems. As governing bodies grapple with what values to encapsulate into regulation, we consider the technical half of the question: To what extent can AI experts vet an AI system for adherence to regulatory requirements? We investigate this question through the lens of two public sector procurement checklists, identifying what we can do now, what should be possible with technical innovation, and what requirements need a more interdisciplinary approach. △ Less

Submitted 27 March, 2024; v1 submitted 21 June, 2023; originally announced June 2023.

Comments: scheduled for publication in the Communications of the ACM, titled "Directions of Technical Innovation for Regulatable AI Systems"

arXiv:2306.11208 [pdf, other]

The Unintended Consequences of Discount Regularization: Improving Regularization in Certainty Equivalence Reinforcement Learning

Authors: Sarah Rathnam, Sonali Parbhoo, Weiwei Pan, Susan A. Murphy, Finale Doshi-Velez

Abstract: Discount regularization, using a shorter planning horizon when calculating the optimal policy, is a popular choice to restrict planning to a less complex set of policies when estimating an MDP from sparse or noisy data (Jiang et al., 2015). It is commonly understood that discount regularization functions by de-emphasizing or ignoring delayed effects. In this paper, we reveal an alternate view of d… ▽ More Discount regularization, using a shorter planning horizon when calculating the optimal policy, is a popular choice to restrict planning to a less complex set of policies when estimating an MDP from sparse or noisy data (Jiang et al., 2015). It is commonly understood that discount regularization functions by de-emphasizing or ignoring delayed effects. In this paper, we reveal an alternate view of discount regularization that exposes unintended consequences. We demonstrate that planning under a lower discount factor produces an identical optimal policy to planning using any prior on the transition matrix that has the same distribution for all states and actions. In fact, it functions like a prior with stronger regularization on state-action pairs with more transition data. This leads to poor performance when the transition matrix is estimated from data sets with uneven amounts of data across state-action pairs. Our equivalence theorem leads to an explicit formula to set regularization parameters locally for individual state-action pairs rather than globally. We demonstrate the failures of discount regularization and how we remedy them using our state-action-specific method across simple empirical examples as well as a medical cancer simulator. △ Less

Submitted 19 June, 2023; originally announced June 2023.

arXiv:2306.07458 [pdf, other]

doi 10.1145/3640543.3645206

Accuracy-Time Tradeoffs in AI-Assisted Decision Making under Time Pressure

Authors: Siddharth Swaroop, Zana Buçinca, Krzysztof Z. Gajos, Finale Doshi-Velez

Abstract: In settings where users both need high accuracy and are time-pressured, such as doctors working in emergency rooms, we want to provide AI assistance that both increases decision accuracy and reduces decision-making time. Current literature focusses on how users interact with AI assistance when there is no time pressure, finding that different AI assistances have different benefits: some can reduce… ▽ More In settings where users both need high accuracy and are time-pressured, such as doctors working in emergency rooms, we want to provide AI assistance that both increases decision accuracy and reduces decision-making time. Current literature focusses on how users interact with AI assistance when there is no time pressure, finding that different AI assistances have different benefits: some can reduce time taken while increasing overreliance on AI, while others do the opposite. The precise benefit can depend on both the user and task. In time-pressured scenarios, adapting when we show AI assistance is especially important: relying on the AI assistance can save time, and can therefore be beneficial when the AI is likely to be right. We would ideally adapt what AI assistance we show depending on various properties (of the task and of the user) in order to best trade off accuracy and time. We introduce a study where users have to answer a series of logic puzzles. We find that time pressure affects how users use different AI assistances, making some assistances more beneficial than others when compared to no-time-pressure settings. We also find that a user's overreliance rate is a key predictor of their behaviour: overreliers and not-overreliers use different AI assistance types differently. We find marginal correlations between a user's overreliance rate (which is related to the user's trust in AI recommendations) and their personality traits (Big Five Personality traits). Overall, our work suggests that AI assistances have different accuracy-time tradeoffs when people are under time pressure compared to no time pressure, and we explore how we might adapt AI assistances in this setting. △ Less

Submitted 11 February, 2024; v1 submitted 12 June, 2023; originally announced June 2023.

arXiv:2305.01738 [pdf, other]

Leveraging Factored Action Spaces for Efficient Offline Reinforcement Learning in Healthcare

Authors: Shengpu Tang, Maggie Makar, Michael W. Sjoding, Finale Doshi-Velez, Jenna Wiens

Abstract: Many reinforcement learning (RL) applications have combinatorial action spaces, where each action is a composition of sub-actions. A standard RL approach ignores this inherent factorization structure, resulting in a potential failure to make meaningful inferences about rarely observed sub-action combinations; this is particularly problematic for offline settings, where data may be limited. In this… ▽ More Many reinforcement learning (RL) applications have combinatorial action spaces, where each action is a composition of sub-actions. A standard RL approach ignores this inherent factorization structure, resulting in a potential failure to make meaningful inferences about rarely observed sub-action combinations; this is particularly problematic for offline settings, where data may be limited. In this work, we propose a form of linear Q-function decomposition induced by factored action spaces. We study the theoretical properties of our approach, identifying scenarios where it is guaranteed to lead to zero bias when used to approximate the Q-function. Outside the regimes with theoretical guarantees, we show that our approach can still be useful because it leads to better sample efficiency without necessarily sacrificing policy optimality, allowing us to achieve a better bias-variance trade-off. Across several offline RL problems using simulators and real-world datasets motivated by healthcare, we demonstrate that incorporating factored action spaces into value-based RL can result in better-performing policies. Our approach can help an agent make more accurate inferences within underexplored regions of the state-action space when applying RL to observational datasets. △ Less

Submitted 2 May, 2023; originally announced May 2023.

Comments: 30 pages, 18 figures, 2 tables. NeurIPS 2022. Code available at https://github.com/MLD3/OfflineRL_FactoredActions

arXiv:2304.03365 [pdf, other]

Decision-Focused Model-based Reinforcement Learning for Reward Transfer

Authors: Abhishek Sharma, Sonali Parbhoo, Omer Gottesman, Finale Doshi-Velez

Abstract: Decision-focused (DF) model-based reinforcement learning has recently been introduced as a powerful algorithm that can focus on learning the MDP dynamics that are most relevant for obtaining high returns. While this approach increases the agent's performance by directly optimizing the reward, it does so by learning less accurate dynamics from a maximum likelihood perspective. We demonstrate that w… ▽ More Decision-focused (DF) model-based reinforcement learning has recently been introduced as a powerful algorithm that can focus on learning the MDP dynamics that are most relevant for obtaining high returns. While this approach increases the agent's performance by directly optimizing the reward, it does so by learning less accurate dynamics from a maximum likelihood perspective. We demonstrate that when the reward function is defined by preferences over multiple objectives, the DF model may be sensitive to changes in the objective preferences.In this work, we develop the robust decision-focused (RDF) algorithm, which leverages the non-identifiability of DF solutions to learn models that maximize expected returns while simultaneously learning models that transfer to changes in the preference over multiple objectives. We demonstrate the effectiveness of RDF on two synthetic domains and two healthcare simulators, showing that it significantly improves the robustness of DF model learning to changes in the reward function without compromising training-time return. △ Less

Submitted 1 January, 2024; v1 submitted 6 April, 2023; originally announced April 2023.

arXiv:2212.00863 [pdf, other]

Modeling Mobile Health Users as Reinforcement Learning Agents

Authors: Eura Shin, Siddharth Swaroop, Weiwei Pan, Susan Murphy, Finale Doshi-Velez

Abstract: Mobile health (mHealth) technologies empower patients to adopt/maintain healthy behaviors in their daily lives, by providing interventions (e.g. push notifications) tailored to the user's needs. In these settings, without intervention, human decision making may be impaired (e.g. valuing near term pleasure over own long term goals). In this work, we formalize this relationship with a framework in w… ▽ More Mobile health (mHealth) technologies empower patients to adopt/maintain healthy behaviors in their daily lives, by providing interventions (e.g. push notifications) tailored to the user's needs. In these settings, without intervention, human decision making may be impaired (e.g. valuing near term pleasure over own long term goals). In this work, we formalize this relationship with a framework in which the user optimizes a (potentially impaired) Markov Decision Process (MDP) and the mHealth agent intervenes on the user's MDP parameters. We show that different types of impairments imply different types of optimal intervention. We also provide analytical and empirical explorations of these differences. △ Less

Submitted 1 December, 2022; originally announced December 2022.

arXiv:2211.09184 [pdf, other]

An Empirical Analysis of the Advantages of Finite- v.s. Infinite-Width Bayesian Neural Networks

Authors: Jiayu Yao, Yaniv Yacoby, Beau Coker, Weiwei Pan, Finale Doshi-Velez

Abstract: Comparing Bayesian neural networks (BNNs) with different widths is challenging because, as the width increases, multiple model properties change simultaneously, and, inference in the finite-width case is intractable. In this work, we empirically compare finite- and infinite-width BNNs, and provide quantitative and qualitative explanations for their performance difference. We find that when the mod… ▽ More Comparing Bayesian neural networks (BNNs) with different widths is challenging because, as the width increases, multiple model properties change simultaneously, and, inference in the finite-width case is intractable. In this work, we empirically compare finite- and infinite-width BNNs, and provide quantitative and qualitative explanations for their performance difference. We find that when the model is mis-specified, increasing width can hurt BNN performance. In these cases, we provide evidence that finite-width BNNs generalize better partially due to the properties of their frequency spectrum that allows them to adapt under model mismatch. △ Less

Submitted 28 November, 2022; v1 submitted 16 November, 2022; originally announced November 2022.

arXiv:2211.07719 [pdf, other]

(When) Are Contrastive Explanations of Reinforcement Learning Helpful?

Authors: Sanjana Narayanan, Isaac Lage, Finale Doshi-Velez

Abstract: Global explanations of a reinforcement learning (RL) agent's expected behavior can make it safer to deploy. However, such explanations are often difficult to understand because of the complicated nature of many RL policies. Effective human explanations are often contrastive, referencing a known contrast (policy) to reduce redundancy. At the same time, these explanations also require the additional… ▽ More Global explanations of a reinforcement learning (RL) agent's expected behavior can make it safer to deploy. However, such explanations are often difficult to understand because of the complicated nature of many RL policies. Effective human explanations are often contrastive, referencing a known contrast (policy) to reduce redundancy. At the same time, these explanations also require the additional effort of referencing that contrast when evaluating an explanation. We conduct a user study to understand whether and when contrastive explanations might be preferable to complete explanations that do not require referencing a contrast. We find that complete explanations are generally more effective when they are the same size or smaller than a contrastive explanation of the same policy, and no worse when they are larger. This suggests that contrastive explanations are not sufficient to solve the problem of effectively explaining reinforcement learning policies, and require additional careful study for use in this context. △ Less

Submitted 14 November, 2022; originally announced November 2022.

Comments: Accepted to NeurIPS 2022 workshop on Human in the Loop Learning

arXiv:2211.05667 [pdf, ps, other]

What Makes a Good Explanation?: A Harmonized View of Properties of Explanations

Authors: Zixi Chen, Varshini Subhash, Marton Havasi, Weiwei Pan, Finale Doshi-Velez

Abstract: Interpretability provides a means for humans to verify aspects of machine learning (ML) models and empower human+ML teaming in situations where the task cannot be fully automated. Different contexts require explanations with different properties. For example, the kind of explanation required to determine if an early cardiac arrest warning system is ready to be integrated into a care setting is ver… ▽ More Interpretability provides a means for humans to verify aspects of machine learning (ML) models and empower human+ML teaming in situations where the task cannot be fully automated. Different contexts require explanations with different properties. For example, the kind of explanation required to determine if an early cardiac arrest warning system is ready to be integrated into a care setting is very different from the type of explanation required for a loan applicant to help determine the actions they might need to take to make their application successful. Unfortunately, there is a lack of standardization when it comes to properties of explanations: different papers may use the same term to mean different quantities, and different terms to mean the same quantity. This lack of a standardized terminology and categorization of the properties of ML explanations prevents us from both rigorously comparing interpretable machine learning methods and identifying what properties are needed in what contexts. In this work, we survey properties defined in interpretable machine learning papers, synthesize them based on what they actually measure, and describe the trade-offs between different formulations of these properties. In doing so, we enable more informed selection of task-appropriate formulations of explanation properties as well as standardization for future work in interpretable machine learning. △ Less

Submitted 12 July, 2024; v1 submitted 10 November, 2022; originally announced November 2022.

Comments: Short version accepted at NeurIPS 2022 workshops on Progress and Challenges in Building Trustworthy Embodied AI and Trustworthy and Socially Responsible Machine Learning

arXiv:2210.15767 [pdf]

Gathering Strength, Gathering Storms: The One Hundred Year Study on Artificial Intelligence (AI100) 2021 Study Panel Report

Authors: Michael L. Littman, Ifeoma Ajunwa, Guy Berger, Craig Boutilier, Morgan Currie, Finale Doshi-Velez, Gillian Hadfield, Michael C. Horowitz, Charles Isbell, Hiroaki Kitano, Karen Levy, Terah Lyons, Melanie Mitchell, Julie Shah, Steven Sloman, Shannon Vallor, Toby Walsh

Abstract: In September 2021, the "One Hundred Year Study on Artificial Intelligence" project (AI100) issued the second report of its planned long-term periodic assessment of artificial intelligence (AI) and its impact on society. It was written by a panel of 17 study authors, each of whom is deeply rooted in AI research, chaired by Michael Littman of Brown University. The report, entitled "Gathering Strengt… ▽ More In September 2021, the "One Hundred Year Study on Artificial Intelligence" project (AI100) issued the second report of its planned long-term periodic assessment of artificial intelligence (AI) and its impact on society. It was written by a panel of 17 study authors, each of whom is deeply rooted in AI research, chaired by Michael Littman of Brown University. The report, entitled "Gathering Strength, Gathering Storms," answers a set of 14 questions probing critical areas of AI development addressing the major risks and dangers of AI, its effects on society, its public perception and the future of the field. The report concludes that AI has made a major leap from the lab to people's lives in recent years, which increases the urgency to understand its potential negative effects. The questions were developed by the AI100 Standing Committee, chaired by Peter Stone of the University of Texas at Austin, consisting of a group of AI leaders with expertise in computer science, sociology, ethics, economics, and other disciplines. △ Less

Submitted 27 October, 2022; originally announced October 2022.

Comments: 82 pages, https://ai100.stanford.edu/gathering-strength-gathering-storms-one-hundred-year-study-artificial-intelligence-ai100-2021-study

arXiv:2209.08682 [pdf, other]

doi 10.1145/3514094.3534198

Towards Robust Off-Policy Evaluation via Human Inputs

Authors: Harvineet Singh, Shalmali Joshi, Finale Doshi-Velez, Himabindu Lakkaraju

Abstract: Off-policy Evaluation (OPE) methods are crucial tools for evaluating policies in high-stakes domains such as healthcare, where direct deployment is often infeasible, unethical, or expensive. When deployment environments are expected to undergo changes (that is, dataset shifts), it is important for OPE methods to perform robust evaluation of the policies amidst such changes. Existing approaches con… ▽ More Off-policy Evaluation (OPE) methods are crucial tools for evaluating policies in high-stakes domains such as healthcare, where direct deployment is often infeasible, unethical, or expensive. When deployment environments are expected to undergo changes (that is, dataset shifts), it is important for OPE methods to perform robust evaluation of the policies amidst such changes. Existing approaches consider robustness against a large class of shifts that can arbitrarily change any observable property of the environment. This often results in highly pessimistic estimates of the utilities, thereby invalidating policies that might have been useful in deployment. In this work, we address the aforementioned problem by investigating how domain knowledge can help provide more realistic estimates of the utilities of policies. We leverage human inputs on which aspects of the environments may plausibly change, and adapt the OPE methods to only consider shifts on these aspects. Specifically, we propose a novel framework, Robust OPE (ROPE), which considers shifts on a subset of covariates in the data based on user inputs, and estimates worst-case utility under these shifts. We then develop computationally efficient algorithms for OPE that are robust to the aforementioned shifts for contextual bandits and Markov decision processes. We also theoretically analyze the sample complexity of these algorithms. Extensive experimentation with synthetic and real world datasets from the healthcare domain demonstrates that our approach not only captures realistic dataset shifts accurately, but also results in less pessimistic policy evaluations. △ Less

Submitted 18 September, 2022; originally announced September 2022.

Comments: 10 pages, 5 figures, 1 table. Appeared at AIES '22: Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society. Expanded version of arXiv:2103.15933

arXiv:2208.07406 [pdf, other]

Reward Design For An Online Reinforcement Learning Algorithm Supporting Oral Self-Care

Authors: Anna L. Trella, Kelly W. Zhang, Inbal Nahum-Shani, Vivek Shetty, Finale Doshi-Velez, Susan A. Murphy

Abstract: Dental disease is one of the most common chronic diseases despite being largely preventable. However, professional advice on optimal oral hygiene practices is often forgotten or abandoned by patients. Therefore patients may benefit from timely and personalized encouragement to engage in oral self-care behaviors. In this paper, we develop an online reinforcement learning (RL) algorithm for use in o… ▽ More Dental disease is one of the most common chronic diseases despite being largely preventable. However, professional advice on optimal oral hygiene practices is often forgotten or abandoned by patients. Therefore patients may benefit from timely and personalized encouragement to engage in oral self-care behaviors. In this paper, we develop an online reinforcement learning (RL) algorithm for use in optimizing the delivery of mobile-based prompts to encourage oral hygiene behaviors. One of the main challenges in developing such an algorithm is ensuring that the algorithm considers the impact of the current action on the effectiveness of future actions (i.e., delayed effects), especially when the algorithm has been made simple in order to run stably and autonomously in a constrained, real-world setting (i.e., highly noisy, sparse data). We address this challenge by designing a quality reward which maximizes the desired health outcome (i.e., high-quality brushing) while minimizing user burden. We also highlight a procedure for optimizing the hyperparameters of the reward by building a simulation environment test bed and evaluating candidates using the test bed. The RL algorithm discussed in this paper will be deployed in Oralytics, an oral self-care app that provides behavioral strategies to boost patient engagement in oral hygiene practices. △ Less

Submitted 14 September, 2022; v1 submitted 15 August, 2022; originally announced August 2022.

arXiv:2208.01705 [pdf, other]

Success of Uncertainty-Aware Deep Models Depends on Data Manifold Geometry

Authors: Mark Penrod, Harrison Termotto, Varshini Reddy, Jiayu Yao, Finale Doshi-Velez, Weiwei Pan

Abstract: For responsible decision making in safety-critical settings, machine learning models must effectively detect and process edge-case data. Although existing works show that predictive uncertainty is useful for these tasks, it is not evident from literature which uncertainty-aware models are best suited for a given dataset. Thus, we compare six uncertainty-aware deep learning models on a set of edge-… ▽ More For responsible decision making in safety-critical settings, machine learning models must effectively detect and process edge-case data. Although existing works show that predictive uncertainty is useful for these tasks, it is not evident from literature which uncertainty-aware models are best suited for a given dataset. Thus, we compare six uncertainty-aware deep learning models on a set of edge-case tasks: robustness to adversarial attacks as well as out-of-distribution and adversarial detection. We find that the geometry of the data sub-manifold is an important factor in determining the success of various models. Our finding suggests an interesting direction in the study of uncertainty-aware deep learning models. △ Less

Submitted 5 August, 2022; v1 submitted 2 August, 2022; originally announced August 2022.

ACM Class: I.2.6

Journal ref: International Conference on Machine Learning. PMLR 162 (2022)

arXiv:2208.00250 [pdf, other]

A Bayesian Approach to Learning Bandit Structure in Markov Decision Processes

Authors: Kelly W. Zhang, Omer Gottesman, Finale Doshi-Velez

Abstract: In the reinforcement learning literature, there are many algorithms developed for either Contextual Bandit (CB) or Markov Decision Processes (MDP) environments. However, when deploying reinforcement learning algorithms in the real world, even with domain expertise, it is often difficult to know whether it is appropriate to treat a sequential decision making problem as a CB or an MDP. In other word… ▽ More In the reinforcement learning literature, there are many algorithms developed for either Contextual Bandit (CB) or Markov Decision Processes (MDP) environments. However, when deploying reinforcement learning algorithms in the real world, even with domain expertise, it is often difficult to know whether it is appropriate to treat a sequential decision making problem as a CB or an MDP. In other words, do actions affect future states, or only the immediate rewards? Making the wrong assumption regarding the nature of the environment can lead to inefficient learning, or even prevent the algorithm from ever learning an optimal policy, even with infinite data. In this work we develop an online algorithm that uses a Bayesian hypothesis testing approach to learn the nature of the environment. Our algorithm allows practitioners to incorporate prior knowledge about whether the environment is that of a CB or an MDP, and effectively interpolate between classical CB and MDP-based algorithms to mitigate against the effects of misspecifying the environment. We perform simulations and demonstrate that in CB settings our algorithm achieves lower regret than MDP-based algorithms, while in non-bandit MDP settings our algorithm is able to learn the optimal policy, often achieving comparable regret to MDP-based algorithms. △ Less

Submitted 30 July, 2022; originally announced August 2022.

Comments: Challenges of Real-World Reinforcement Learning 2020 (NeurIPS Workshop)

arXiv:2207.06269 [pdf, other]

Policy Optimization with Sparse Global Contrastive Explanations

Authors: Jiayu Yao, Sonali Parbhoo, Weiwei Pan, Finale Doshi-Velez

Abstract: We develop a Reinforcement Learning (RL) framework for improving an existing behavior policy via sparse, user-interpretable changes. Our goal is to make minimal changes while gaining as much benefit as possible. We define a minimal change as having a sparse, global contrastive explanation between the original and proposed policy. We improve the current policy with the constraint of keeping that gl… ▽ More We develop a Reinforcement Learning (RL) framework for improving an existing behavior policy via sparse, user-interpretable changes. Our goal is to make minimal changes while gaining as much benefit as possible. We define a minimal change as having a sparse, global contrastive explanation between the original and proposed policy. We improve the current policy with the constraint of keeping that global contrastive explanation short. We demonstrate our framework with a discrete MDP and a continuous 2D navigation domain. △ Less

Submitted 13 July, 2022; originally announced July 2022.

Comments: Accepted at IMLH Workshop, ICML 2022

arXiv:2206.10847 [pdf, other]

Connecting Algorithmic Research and Usage Contexts: A Perspective of Contextualized Evaluation for Explainable AI

Authors: Q. Vera Liao, Yunfeng Zhang, Ronny Luss, Finale Doshi-Velez, Amit Dhurandhar

Abstract: Recent years have seen a surge of interest in the field of explainable AI (XAI), with a plethora of algorithms proposed in the literature. However, a lack of consensus on how to evaluate XAI hinders the advancement of the field. We highlight that XAI is not a monolithic set of technologies -- researchers and practitioners have begun to leverage XAI algorithms to build XAI systems that serve differ… ▽ More Recent years have seen a surge of interest in the field of explainable AI (XAI), with a plethora of algorithms proposed in the literature. However, a lack of consensus on how to evaluate XAI hinders the advancement of the field. We highlight that XAI is not a monolithic set of technologies -- researchers and practitioners have begun to leverage XAI algorithms to build XAI systems that serve different usage contexts, such as model debugging and decision-support. Algorithmic research of XAI, however, often does not account for these diverse downstream usage contexts, resulting in limited effectiveness or even unintended consequences for actual users, as well as difficulties for practitioners to make technical choices. We argue that one way to close the gap is to develop evaluation methods that account for different user requirements in these usage contexts. Towards this goal, we introduce a perspective of contextualized XAI evaluation by considering the relative importance of XAI evaluation criteria for prototypical usage contexts of XAI. To explore the context dependency of XAI evaluation criteria, we conduct two survey studies, one with XAI topical experts and another with crowd workers. Our results urge for responsible AI research with usage-informed evaluation practices, and provide a nuanced understanding of user requirements for XAI in different usage contexts. △ Less

Submitted 20 September, 2022; v1 submitted 22 June, 2022; originally announced June 2022.

Comments: Forthcoming for AAAI HCOMP 2022

arXiv:2206.03944 [pdf, other]

doi 10.3390/a15080255

Designing Reinforcement Learning Algorithms for Digital Interventions: Pre-implementation Guidelines

Authors: Anna L. Trella, Kelly W. Zhang, Inbal Nahum-Shani, Vivek Shetty, Finale Doshi-Velez, Susan A. Murphy

Abstract: Online reinforcement learning (RL) algorithms are increasingly used to personalize digital interventions in the fields of mobile health and online education. Common challenges in designing and testing an RL algorithm in these settings include ensuring the RL algorithm can learn and run stably under real-time constraints, and accounting for the complexity of the environment, e.g., a lack of accurat… ▽ More Online reinforcement learning (RL) algorithms are increasingly used to personalize digital interventions in the fields of mobile health and online education. Common challenges in designing and testing an RL algorithm in these settings include ensuring the RL algorithm can learn and run stably under real-time constraints, and accounting for the complexity of the environment, e.g., a lack of accurate mechanistic models for the user dynamics. To guide how one can tackle these challenges, we extend the PCS (Predictability, Computability, Stability) framework, a data science framework that incorporates best practices from machine learning and statistics in supervised learning (Yu and Kumbier, 2020), to the design of RL algorithms for the digital interventions setting. Further, we provide guidelines on how to design simulation environments, a crucial tool for evaluating RL candidate algorithms using the PCS framework. We illustrate the use of the PCS framework for designing an RL algorithm for Oralytics, a mobile health study aiming to improve users' tooth-brushing behaviors through the personalized delivery of intervention messages. Oralytics will go into the field in late 2022. △ Less

Submitted 18 August, 2022; v1 submitted 8 June, 2022; originally announced June 2022.

arXiv:2204.03208 [pdf, other]

A Joint Learning Approach for Semi-supervised Neural Topic Modeling

Authors: Jeffrey Chiu, Rajat Mittal, Neehal Tumma, Abhishek Sharma, Finale Doshi-Velez

Abstract: Topic models are some of the most popular ways to represent textual data in an interpret-able manner. Recently, advances in deep generative models, specifically auto-encoding variational Bayes (AEVB), have led to the introduction of unsupervised neural topic models, which leverage deep generative models as opposed to traditional statistics-based topic models. We extend upon these neural topic mode… ▽ More Topic models are some of the most popular ways to represent textual data in an interpret-able manner. Recently, advances in deep generative models, specifically auto-encoding variational Bayes (AEVB), have led to the introduction of unsupervised neural topic models, which leverage deep generative models as opposed to traditional statistics-based topic models. We extend upon these neural topic models by introducing the Label-Indexed Neural Topic Model (LI-NTM), which is, to the extent of our knowledge, the first effective upstream semi-supervised neural topic model. We find that LI-NTM outperforms existing neural topic models in document reconstruction benchmarks, with the most notable results in low labeled data regimes and for data-sets with informative labels; furthermore, our jointly learned classifier outperforms baseline classifiers in ablation studies. △ Less

Submitted 7 April, 2022; originally announced April 2022.

Comments: To appear in the 6th ACL Workshop on Structured Prediction for NLP (SPNLP)

arXiv:2202.11670 [pdf, other]

Wide Mean-Field Bayesian Neural Networks Ignore the Data

Authors: Beau Coker, Wessel P. Bruinsma, David R. Burt, Weiwei Pan, Finale Doshi-Velez

Abstract: Bayesian neural networks (BNNs) combine the expressive power of deep learning with the advantages of Bayesian formalism. In recent years, the analysis of wide, deep BNNs has provided theoretical insight into their priors and posteriors. However, we have no analogous insight into their posteriors under approximate inference. In this work, we show that mean-field variational inference entirely fails… ▽ More Bayesian neural networks (BNNs) combine the expressive power of deep learning with the advantages of Bayesian formalism. In recent years, the analysis of wide, deep BNNs has provided theoretical insight into their priors and posteriors. However, we have no analogous insight into their posteriors under approximate inference. In this work, we show that mean-field variational inference entirely fails to model the data when the network width is large and the activation function is odd. Specifically, for fully-connected BNNs with odd activation functions and a homoscedastic Gaussian likelihood, we show that the optimal mean-field variational posterior predictive (i.e., function space) distribution converges to the prior predictive distribution as the width tends to infinity. We generalize aspects of this result to other likelihoods. Our theoretical results are suggestive of underfitting behavior previously observered in BNNs. While our convergence bounds are non-asymptotic and constants in our analysis can be computed, they are currently too loose to be applicable in standard training regimes. Finally, we show that the optimal approximate posterior need not tend to the prior if the activation function is not odd, showing that our statements cannot be generalized arbitrarily. △ Less

Submitted 23 February, 2022; originally announced February 2022.

arXiv:2201.08262 [pdf, other]

Generalizing Off-Policy Evaluation From a Causal Perspective For Sequential Decision-Making

Authors: Sonali Parbhoo, Shalmali Joshi, Finale Doshi-Velez

Abstract: Assessing the effects of a policy based on observational data from a different policy is a common problem across several high-stake decision-making domains, and several off-policy evaluation (OPE) techniques have been proposed. However, these methods largely formulate OPE as a problem disassociated from the process used to generate the data (i.e. structural assumptions in the form of a causal grap… ▽ More Assessing the effects of a policy based on observational data from a different policy is a common problem across several high-stake decision-making domains, and several off-policy evaluation (OPE) techniques have been proposed. However, these methods largely formulate OPE as a problem disassociated from the process used to generate the data (i.e. structural assumptions in the form of a causal graph). We argue that explicitly highlighting this association has important implications on our understanding of the fundamental limits of OPE. First, this implies that current formulation of OPE corresponds to a narrow set of tasks, i.e. a specific causal estimand which is focused on prospective evaluation of policies over populations or sub-populations. Second, we demonstrate how this association motivates natural desiderata to consider a general set of causal estimands, particularly extending the role of OPE for counterfactual off-policy evaluation at the level of individuals of the population. A precise description of the causal estimand highlights which OPE estimands are identifiable from observational data under the stated generative assumptions. For those OPE estimands that are not identifiable, the causal perspective further highlights where more experimental data is necessary, and highlights situations where human expertise can aid identification and estimation. Furthermore, many formalisms of OPE overlook the role of uncertainty entirely in the estimation process.We demonstrate how specifically characterising the causal estimand highlights the different sources of uncertainty and when human expertise can naturally manage this uncertainty. We discuss each of these aspects as actionable desiderata for future OPE research at scale and in-line with practical utility. △ Less

Submitted 20 January, 2022; originally announced January 2022.

arXiv:2111.14272 [pdf, other]

Identification of Subgroups With Similar Benefits in Off-Policy Policy Evaluation

Authors: Ramtin Keramati, Omer Gottesman, Leo Anthony Celi, Finale Doshi-Velez, Emma Brunskill

Abstract: Off-policy policy evaluation methods for sequential decision making can be used to help identify if a proposed decision policy is better than a current baseline policy. However, a new decision policy may be better than a baseline policy for some individuals but not others. This has motivated a push towards personalization and accurate per-state estimates of heterogeneous treatment effects (HTEs).… ▽ More Off-policy policy evaluation methods for sequential decision making can be used to help identify if a proposed decision policy is better than a current baseline policy. However, a new decision policy may be better than a baseline policy for some individuals but not others. This has motivated a push towards personalization and accurate per-state estimates of heterogeneous treatment effects (HTEs). Given the limited data present in many important applications, individual predictions can come at a cost to accuracy and confidence in such predictions. We develop a method to balance the need for personalization with confident predictions by identifying subgroups where it is possible to confidently estimate the expected difference in a new decision policy relative to a baseline. We propose a novel loss function that accounts for uncertainty during the subgroup partitioning phase. In experiments, we show that our method can be used to form accurate predictions of HTEs where other methods struggle. △ Less

Submitted 28 November, 2021; originally announced November 2021.

arXiv:2110.13221 [pdf, other]

On Learning Prediction-Focused Mixtures

Authors: Abhishek Sharma, Catherine Zeng, Sanjana Narayanan, Sonali Parbhoo, Finale Doshi-Velez

Abstract: Probabilistic models help us encode latent structures that both model the data and are ideally also useful for specific downstream tasks. Among these, mixture models and their time-series counterparts, hidden Markov models, identify discrete components in the data. In this work, we focus on a constrained capacity setting, where we want to learn a model with relatively few components (e.g. for inte… ▽ More Probabilistic models help us encode latent structures that both model the data and are ideally also useful for specific downstream tasks. Among these, mixture models and their time-series counterparts, hidden Markov models, identify discrete components in the data. In this work, we focus on a constrained capacity setting, where we want to learn a model with relatively few components (e.g. for interpretability purposes). To maintain prediction performance, we introduce prediction-focused modeling for mixtures, which automatically selects the dimensions relevant to the prediction task. Our approach identifies relevant signal from the input, outperforms models that are not prediction-focused, and is easy to optimize; we also characterize when prediction-focused modeling can be expected to work. △ Less

Submitted 27 October, 2021; v1 submitted 25 October, 2021; originally announced October 2021.

arXiv:2109.11043 [pdf, other]

Learning Predictive and Interpretable Timeseries Summaries from ICU Data

Authors: Nari Johnson, Sonali Parbhoo, Andrew Slavin Ross, Finale Doshi-Velez

Abstract: Machine learning models that utilize patient data across time (rather than just the most recent measurements) have increased performance for many risk stratification tasks in the intensive care unit. However, many of these models and their learned representations are complex and therefore difficult for clinicians to interpret, creating challenges for validation. Our work proposes a new procedure t… ▽ More Machine learning models that utilize patient data across time (rather than just the most recent measurements) have increased performance for many risk stratification tasks in the intensive care unit. However, many of these models and their learned representations are complex and therefore difficult for clinicians to interpret, creating challenges for validation. Our work proposes a new procedure to learn summaries of clinical time-series that are both predictive and easily understood by humans. Specifically, our summaries consist of simple and intuitive functions of clinical data (e.g. falling mean arterial pressure). Our learned summaries outperform traditional interpretable model classes and achieve performance comparable to state-of-the-art deep learning models on an in-hospital mortality classification task. △ Less

Submitted 22 September, 2021; originally announced September 2021.

Comments: 10 pages, 3 figures, AMIA 2021 Annual Symposium

arXiv:2109.08134 [pdf, other]

Comparison and Unification of Three Regularization Methods in Batch Reinforcement Learning

Authors: Sarah Rathnam, Susan A. Murphy, Finale Doshi-Velez

Abstract: In batch reinforcement learning, there can be poorly explored state-action pairs resulting in poorly learned, inaccurate models and poorly performing associated policies. Various regularization methods can mitigate the problem of learning overly-complex models in Markov decision processes (MDPs), however they operate in technically and intuitively distinct ways and lack a common form in which to c… ▽ More In batch reinforcement learning, there can be poorly explored state-action pairs resulting in poorly learned, inaccurate models and poorly performing associated policies. Various regularization methods can mitigate the problem of learning overly-complex models in Markov decision processes (MDPs), however they operate in technically and intuitively distinct ways and lack a common form in which to compare them. This paper unifies three regularization methods in a common framework -- a weighted average transition matrix. Considering regularization methods in this common form illuminates how the MDP structure and the state-action pair distribution of the batch data set influence the relative performance of regularization methods. We confirm intuitions generated from the common framework by empirical evaluation across a range of MDPs and data collection policies. △ Less

Submitted 16 September, 2021; originally announced September 2021.

Comments: ICML Workshop on Reinforcement Learning Theory 2021

arXiv:2109.06312 [pdf, other]

Learning-to-defer for sequential medical decision-making under uncertainty

Authors: Shalmali Joshi, Sonali Parbhoo, Finale Doshi-Velez

Abstract: Learning-to-defer is a framework to automatically defer decision-making to a human expert when ML-based decisions are deemed unreliable. Existing learning-to-defer frameworks are not designed for sequential settings. That is, they defer at every instance independently, based on immediate predictions, while ignoring the potential long-term impact of these interventions. As a result, existing framew… ▽ More Learning-to-defer is a framework to automatically defer decision-making to a human expert when ML-based decisions are deemed unreliable. Existing learning-to-defer frameworks are not designed for sequential settings. That is, they defer at every instance independently, based on immediate predictions, while ignoring the potential long-term impact of these interventions. As a result, existing frameworks are myopic. Further, they do not defer adaptively, which is crucial when human interventions are costly. In this work, we propose Sequential Learning-to-Defer (SLTD), a framework for learning-to-defer to a domain expert in sequential decision-making settings. Contrary to existing literature, we pose the problem of learning-to-defer as model-based reinforcement learning (RL) to i) account for long-term consequences of ML-based actions using RL and ii) adaptively defer based on the dynamics (model-based). Our proposed framework determines whether to defer (at each time step) by quantifying whether a deferral now will improve the value compared to delaying deferral to the next time step. To quantify the improvement, we account for potential future deferrals. As a result, we learn a pre-emptive deferral policy (i.e. a policy that defers early if using the ML-based policy could worsen long-term outcomes). Our deferral policy is adaptive to the non-stationarity in the dynamics. We demonstrate that adaptive deferral via SLTD provides an improved trade-off between long-term outcomes and deferral frequency on synthetic, semi-synthetic, and real-world data with non-stationary dynamics. Finally, we interpret the deferral decision by decomposing the propagated (long-term) uncertainty around the outcome, to justify the deferral decision. △ Less

Submitted 5 December, 2022; v1 submitted 13 September, 2021; originally announced September 2021.

arXiv:2109.06310 [pdf, other]

State Relevance for Off-Policy Evaluation

Authors: Simon P. Shen, Yecheng Jason Ma, Omer Gottesman, Finale Doshi-Velez

Abstract: Importance sampling-based estimators for off-policy evaluation (OPE) are valued for their simplicity, unbiasedness, and reliance on relatively few assumptions. However, the variance of these estimators is often high, especially when trajectories are of different lengths. In this work, we introduce Omitting-States-Irrelevant-to-Return Importance Sampling (OSIRIS), an estimator which reduces varianc… ▽ More Importance sampling-based estimators for off-policy evaluation (OPE) are valued for their simplicity, unbiasedness, and reliance on relatively few assumptions. However, the variance of these estimators is often high, especially when trajectories are of different lengths. In this work, we introduce Omitting-States-Irrelevant-to-Return Importance Sampling (OSIRIS), an estimator which reduces variance by strategically omitting likelihood ratios associated with certain states. We formalize the conditions under which OSIRIS is unbiased and has lower variance than ordinary importance sampling, and we demonstrate these properties empirically. △ Less

Submitted 13 September, 2021; originally announced September 2021.

Comments: ICML 2021

Journal ref: Proceedings of the 38th International Conference on Machine Learning, PMLR 139:9537-9546, 2021

arXiv:2107.09949 [pdf, other]

Online structural kernel selection for mobile health

Authors: Eura Shin, Pedja Klasnja, Susan Murphy, Finale Doshi-Velez

Abstract: Motivated by the need for efficient and personalized learning in mobile health, we investigate the problem of online kernel selection for Gaussian Process regression in the multi-task setting. We propose a novel generative process on the kernel composition for this purpose. Our method demonstrates that trajectories of kernel evolutions can be transferred between users to improve learning and that… ▽ More Motivated by the need for efficient and personalized learning in mobile health, we investigate the problem of online kernel selection for Gaussian Process regression in the multi-task setting. We propose a novel generative process on the kernel composition for this purpose. Our method demonstrates that trajectories of kernel evolutions can be transferred between users to improve learning and that the kernels themselves are meaningful for an mHealth prediction goal. △ Less

Submitted 21 July, 2021; originally announced July 2021.

Comments: Workshop paper in ICML IMLH 2021

arXiv:2106.13314 [pdf, other]

Promises and Pitfalls of Black-Box Concept Learning Models

Authors: Anita Mahinpei, Justin Clark, Isaac Lage, Finale Doshi-Velez, Weiwei Pan

Abstract: Machine learning models that incorporate concept learning as an intermediate step in their decision making process can match the performance of black-box predictive models while retaining the ability to explain outcomes in human understandable terms. However, we demonstrate that the concept representations learned by these models encode information beyond the pre-defined concepts, and that natural… ▽ More Machine learning models that incorporate concept learning as an intermediate step in their decision making process can match the performance of black-box predictive models while retaining the ability to explain outcomes in human understandable terms. However, we demonstrate that the concept representations learned by these models encode information beyond the pre-defined concepts, and that natural mitigation strategies do not fully work, rendering the interpretation of the downstream prediction misleading. We describe the mechanism underlying the information leakage and suggest recourse for mitigating its effects. △ Less

Submitted 24 June, 2021; originally announced June 2021.

arXiv:2106.07052 [pdf, other]

Wide Mean-Field Variational Bayesian Neural Networks Ignore the Data

Authors: Beau Coker, Weiwei Pan, Finale Doshi-Velez

Abstract: Variational inference enables approximate posterior inference of the highly over-parameterized neural networks that are popular in modern machine learning. Unfortunately, such posteriors are known to exhibit various pathological behaviors. We prove that as the number of hidden units in a single-layer Bayesian neural network tends to infinity, the function-space posterior mean under mean-field vari… ▽ More Variational inference enables approximate posterior inference of the highly over-parameterized neural networks that are popular in modern machine learning. Unfortunately, such posteriors are known to exhibit various pathological behaviors. We prove that as the number of hidden units in a single-layer Bayesian neural network tends to infinity, the function-space posterior mean under mean-field variational inference actually converges to zero, completely ignoring the data. This is in contrast to the true posterior, which converges to a Gaussian process. Our work provides insight into the over-regularization of the KL divergence in variational inference. △ Less

Submitted 13 June, 2021; originally announced June 2021.

arXiv:2106.03279 [pdf, other]

Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Problems by Reinforcement Learning

Authors: Kai Wang, Sanket Shah, Haipeng Chen, Andrew Perrault, Finale Doshi-Velez, Milind Tambe

Abstract: In the predict-then-optimize framework, the objective is to train a predictive model, mapping from environment features to parameters of an optimization problem, which maximizes decision quality when the optimization is subsequently solved. Recent work on decision-focused learning shows that embedding the optimization problem in the training pipeline can improve decision quality and help generaliz… ▽ More In the predict-then-optimize framework, the objective is to train a predictive model, mapping from environment features to parameters of an optimization problem, which maximizes decision quality when the optimization is subsequently solved. Recent work on decision-focused learning shows that embedding the optimization problem in the training pipeline can improve decision quality and help generalize better to unseen tasks compared to relying on an intermediate loss function for evaluating prediction quality. We study the predict-then-optimize framework in the context of sequential decision problems (formulated as MDPs) that are solved via reinforcement learning. In particular, we are given environment features and a set of trajectories from training MDPs, which we use to train a predictive model that generalizes to unseen test MDPs without trajectories. Two significant computational challenges arise in applying decision-focused learning to MDPs: (i) large state and action spaces make it infeasible for existing techniques to differentiate through MDP problems, and (ii) the high-dimensional policy space, as parameterized by a neural network, makes differentiating through a policy expensive. We resolve the first challenge by sampling provably unbiased derivatives to approximate and differentiate through optimality conditions, and the second challenge by using a low-rank approximation to the high-dimensional sample-based derivatives. We implement both Bellman--based and policy gradient--based decision-focused learning on three different MDP problems with missing parameters, and show that decision-focused learning performs better in generalization to unseen tasks. △ Less

Submitted 16 July, 2022; v1 submitted 6 June, 2021; originally announced June 2021.

arXiv:2103.15933 [pdf, other]

Learning Under Adversarial and Interventional Shifts

Authors: Harvineet Singh, Shalmali Joshi, Finale Doshi-Velez, Himabindu Lakkaraju

Abstract: Machine learning models are often trained on data from one distribution and deployed on others. So it becomes important to design models that are robust to distribution shifts. Most of the existing work focuses on optimizing for either adversarial shifts or interventional shifts. Adversarial methods lack expressivity in representing plausible shifts as they consider shifts to joint distributions i… ▽ More Machine learning models are often trained on data from one distribution and deployed on others. So it becomes important to design models that are robust to distribution shifts. Most of the existing work focuses on optimizing for either adversarial shifts or interventional shifts. Adversarial methods lack expressivity in representing plausible shifts as they consider shifts to joint distributions in the data. Interventional methods allow more expressivity but provide robustness to unbounded shifts, resulting in overly conservative models. In this work, we combine the complementary strengths of the two approaches and propose a new formulation, RISe, for designing robust models against a set of distribution shifts that are at the intersection of adversarial and interventional shifts. We employ the distributionally robust optimization framework to optimize the resulting objective in both supervised and reinforcement learning settings. Extensive experimentation with synthetic and real world datasets from healthcare demonstrate the efficacy of the proposed approach. △ Less

Submitted 29 March, 2021; originally announced March 2021.

Comments: 19 pages including 5 pages appendix, 6 figures, 2 tables. Preliminary version presented at Causal Discovery & Causality-Inspired Machine Learning Workshop 2020

arXiv:2102.05185 [pdf, other]

Benchmarks, Algorithms, and Metrics for Hierarchical Disentanglement

Authors: Andrew Slavin Ross, Finale Doshi-Velez

Abstract: In representation learning, there has been recent interest in developing algorithms to disentangle the ground-truth generative factors behind a dataset, and metrics to quantify how fully this occurs. However, these algorithms and metrics often assume that both representations and ground-truth factors are flat, continuous, and factorized, whereas many real-world generative processes involve rich hi… ▽ More In representation learning, there has been recent interest in developing algorithms to disentangle the ground-truth generative factors behind a dataset, and metrics to quantify how fully this occurs. However, these algorithms and metrics often assume that both representations and ground-truth factors are flat, continuous, and factorized, whereas many real-world generative processes involve rich hierarchical structure, mixtures of discrete and continuous variables with dependence between them, and even varying intrinsic dimensionality. In this work, we develop benchmarks, algorithms, and metrics for learning such hierarchical representations. △ Less

Submitted 8 April, 2022; v1 submitted 9 February, 2021; originally announced February 2021.

Comments: ICML 2021 paper, fixed incorrect version upload

arXiv:2102.01264 [pdf, other]

Evaluating the Interpretability of Generative Models by Interactive Reconstruction

Authors: Andrew Slavin Ross, Nina Chen, Elisa Zhao Hang, Elena L. Glassman, Finale Doshi-Velez

Abstract: For machine learning models to be most useful in numerous sociotechnical systems, many have argued that they must be human-interpretable. However, despite increasing interest in interpretability, there remains no firm consensus on how to measure it. This is especially true in representation learning, where interpretability research has focused on "disentanglement" measures only applicable to synth… ▽ More For machine learning models to be most useful in numerous sociotechnical systems, many have argued that they must be human-interpretable. However, despite increasing interest in interpretability, there remains no firm consensus on how to measure it. This is especially true in representation learning, where interpretability research has focused on "disentanglement" measures only applicable to synthetic datasets and not grounded in human factors. We introduce a task to quantify the human-interpretability of generative model representations, where users interactively modify representations to reconstruct target instances. On synthetic datasets, we find performance on this task much more reliably differentiates entangled and disentangled models than baseline approaches. On a real dataset, we find it differentiates between representation learning methods widely believed but never shown to produce more or less interpretable models. In both cases, we ran small-scale think-aloud studies and large-scale experiments on Amazon Mechanical Turk to confirm that our qualitative and quantitative results agreed. △ Less

Submitted 1 February, 2021; originally announced February 2021.

Comments: CHI 2021 accepted paper

Showing 1–50 of 115 results for author: Doshi-velez, F