Showing 1–29 of 29 results for author: Heller, K

  1. arXiv:2406.04993  [pdf]

    q-bio.NC cs.LG

    Development and Validation of a Deep-Learning Model for Differential Treatment Benefit Prediction for Adults with Major Depressive Disorder Deployed in the Artificial Intelligence in Depression Medication Enhancement (AIDME) Study

    Authors: David Benrimoh, Caitrin Armstrong, Joseph Mehltretter, Robert Fratila, Kelly Perlman, Sonia Israel, Adam Kapelner, Sagar V. Parikh, Jordan F. Karp, Katherine Heller, Gustavo Turecki

    Abstract: INTRODUCTION: The pharmacological treatment of Major Depressive Disorder (MDD) relies on a trial-and-error approach. We introduce an artificial intelligence (AI) model aiming to personalize treatment and improve outcomes, which was deployed in the Artificial Intelligence in Depression Medication Enhancement (AIDME) Study. OBJECTIVES: 1) Develop a model capable of predicting probabilities of remiss…

    Submitted 7 June, 2024; originally announced June 2024.

  2. arXiv:2403.12025  [pdf, other]

    cs.CY cs.CL cs.LG

    A Toolbox for Surfacing Health Equity Harms and Biases in Large Language Models

    Authors: Stephen R. Pfohl, Heather Cole-Lewis, Rory Sayres, Darlene Neal, Mercy Asiedu, Awa Dieng, Nenad Tomasev, Qazi Mamunur Rashid, Shekoofeh Azizi, Negar Rostamzadeh, Liam G. McCoy, Leo Anthony Celi, Yun Liu, Mike Schaekermann, Alanna Walton, Alicia Parrish, Chirag Nagpal, Preeti Singh, Akeiylah Dewitt, Philip Mansfield, Sushant Prakash, Katherine Heller, Alan Karthikesalingam, Christopher Semturs, Joelle Barral, et al. (5 additional authors not shown)

    Abstract: Large language models (LLMs) hold immense promise to serve complex health information needs but also have the potential to introduce harm and exacerbate health disparities. Reliably evaluating equity-related model failures is a critical step toward developing systems that promote health equity. In this work, we present resources and methodologies for surfacing biases with potential to precipitate…

    Submitted 18 March, 2024; originally announced March 2024.

  3. arXiv:2403.03357  [pdf, other]

    cs.AI cs.CY

    The Case for Globalizing Fairness: A Mixed Methods Study on Colonialism, AI, and Health in Africa

    Authors: Mercy Asiedu, Awa Dieng, Iskandar Haykel, Negar Rostamzadeh, Stephen Pfohl, Chirag Nagpal, Maria Nagawa, Abigail Oppong, Sanmi Koyejo, Katherine Heller

    Abstract: With growing application of machine learning (ML) technologies in healthcare, there have been calls for developing techniques to understand and mitigate biases these systems may exhibit. Fairness considerations in the development of ML-based solutions for health have particular implications for Africa, which already faces inequitable power imbalances between the Global North and South. This paper…

    Submitted 11 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: 11 pages, 4 figures. arXiv admin note: text overlap with arXiv:2304.02190

  4. arXiv:2312.09244  [pdf, other]

    cs.LG

    Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking

    Authors: Jacob Eisenstein, Chirag Nagpal, Alekh Agarwal, Ahmad Beirami, Alex D'Amour, DJ Dvijotham, Adam Fisch, Katherine Heller, Stephen Pfohl, Deepak Ramachandran, Peter Shaw, Jonathan Berant

    Abstract: Reward models play a key role in aligning language model applications towards human preferences. However, this setup creates an incentive for the language model to exploit errors in the reward model to achieve high estimated reward, a phenomenon often termed reward hacking. A natural mitigation is to train an ensemble of reward models, aggregating over model outputs to obtain a more robust…

    Submitted 20 December, 2023; v1 submitted 14 December, 2023; originally announced December 2023.

  5. arXiv:2309.17249  [pdf, other]

    cs.CL cs.AI cs.LG

    Batch Calibration: Rethinking Calibration for In-Context Learning and Prompt Engineering

    Authors: Han Zhou, Xingchen Wan, Lev Proleev, Diana Mincu, Jilin Chen, Katherine Heller, Subhrajit Roy

    Abstract: Prompting and in-context learning (ICL) have become efficient learning paradigms for large language models (LLMs). However, LLMs suffer from prompt brittleness and various bias factors in the prompt, including but not limited to the formatting, the choice of verbalizers, and the ICL examples. To address this problem that results in unexpected performance degradation, calibration methods have been dev…

    Submitted 24 January, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: ICLR 2024. 9 pages, 9 figures, 3 tables (22 pages, 11 figures, 11 tables including references and appendices)

  6. arXiv:2306.07946  [pdf, other]

    cs.SI cs.AI cs.IR

    STUDY: Socially Aware Temporally Causal Decoder Recommender Systems

    Authors: Eltayeb Ahmed, Diana Mincu, Lauren Harrell, Katherine Heller, Subhrajit Roy

    Abstract: Recommender systems are widely used to help people find items that are tailored to their interests. These interests are often influenced by social networks, making it important to use social network information effectively in recommender systems. This is especially true for demographic groups with interests that differ from the majority. This paper introduces STUDY, a Socially-aware Temporally caU…

    Submitted 5 September, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: 15 pages, 5 figures

  7. arXiv:2304.02190  [pdf, other]

    cs.LG cs.AI cs.CY

    Globalizing Fairness Attributes in Machine Learning: A Case Study on Health in Africa

    Authors: Mercy Nyamewaa Asiedu, Awa Dieng, Abigail Oppong, Maria Nagawa, Sanmi Koyejo, Katherine Heller

    Abstract: With growing machine learning (ML) applications in healthcare, there have been calls for fairness in ML to understand and mitigate ethical concerns these systems may pose. Fairness has implications for global health in Africa, which already has inequitable power imbalances between the Global North and South. This paper seeks to explore fairness for global health, with Africa as a case study. We pr…

    Submitted 4 April, 2023; originally announced April 2023.

  8. arXiv:2302.07854  [pdf, other]

    cs.LG

    Benchmarking Continuous Time Models for Predicting Multiple Sclerosis Progression

    Authors: Alexander Norcliffe, Lev Proleev, Diana Mincu, Fletcher Lee Hartsell, Katherine Heller, Subhrajit Roy

    Abstract: Multiple sclerosis is a disease that affects the brain and spinal cord; it can lead to severe disability and has no known cure. The majority of prior work in machine learning for multiple sclerosis has been centered around using Magnetic Resonance Imaging scans or laboratory tests; these modalities are both expensive to acquire and can be unreliable. In a recent paper it was shown that disease pro…

    Submitted 9 September, 2023; v1 submitted 15 February, 2023; originally announced February 2023.

    Comments: 32 pages, 2 figures, 17 tables, published in TMLR 2023

  9. arXiv:2302.03874  [pdf, other]

    cs.LG cs.CY

    Participatory Personalization in Classification

    Authors: Hailey Joren, Chirag Nagpal, Katherine Heller, Berk Ustun

    Abstract: Machine learning models are often personalized with information that is protected, sensitive, self-reported, or costly to acquire. These models use information about people but do not facilitate nor inform their consent. Individuals cannot opt out of reporting personal information to a model, nor tell if they benefit from personalization in the first place. We introduce a family of classification…

    Submitted 11 October, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  10. arXiv:2205.05256  [pdf, other]

    cs.LG

    Evaluation Gaps in Machine Learning Practice

    Authors: Ben Hutchinson, Negar Rostamzadeh, Christina Greer, Katherine Heller, Vinodkumar Prabhakaran

    Abstract: Forming a reliable judgement of a machine learning (ML) model's appropriateness for an application ecosystem is critical for its responsible use, and requires considering a broad range of factors including harms, benefits, and responsibilities. In practice, however, evaluations of ML models frequently focus on only a narrow range of decontextualized predictive behaviours. We examine the evaluation…

    Submitted 11 May, 2022; originally announced May 2022.

  11. arXiv:2204.03969  [pdf, other]

    cs.LG

    Disability prediction in multiple sclerosis using performance outcome measures and demographic data

    Authors: Subhrajit Roy, Diana Mincu, Lev Proleev, Negar Rostamzadeh, Chintan Ghate, Natalie Harris, Christina Chen, Jessica Schrouff, Nenad Tomasev, Fletcher Lee Hartsell, Katherine Heller

    Abstract: Literature on machine learning for multiple sclerosis has primarily focused on the use of neuroimaging data such as magnetic resonance imaging and clinical laboratory tests for disease identification. However, studies have shown that these modalities are not consistent with disease activity such as symptoms or disease progression. Furthermore, the cost of collecting data from these modalities is h…

    Submitted 8 April, 2022; originally announced April 2022.

  12. arXiv:2202.13028  [pdf, ps, other]

    cs.AI cs.HC

    Healthsheet: Development of a Transparency Artifact for Health Datasets

    Authors: Negar Rostamzadeh, Diana Mincu, Subhrajit Roy, Andrew Smart, Lauren Wilcox, Mahima Pushkarna, Jessica Schrouff, Razvan Amironesei, Nyalleng Moorosi, Katherine Heller

    Abstract: Machine learning (ML) approaches have demonstrated promising results in a wide range of healthcare applications. Data plays a crucial role in developing ML-based healthcare systems that directly affect people's lives. Many of the ethical issues surrounding the use of ML in healthcare stem from structural inequalities underlying the way we collect, use, and handle data. Developing guidelines to imp…

    Submitted 25 February, 2022; originally announced February 2022.

  13. arXiv:2202.01034  [pdf, other]

    cs.LG cs.CY stat.ML

    Diagnosing failures of fairness transfer across distribution shift in real-world medical settings

    Authors: Jessica Schrouff, Natalie Harris, Oluwasanmi Koyejo, Ibrahim Alabdulmohsin, Eva Schnider, Krista Opsahl-Ong, Alex Brown, Subhrajit Roy, Diana Mincu, Christina Chen, Awa Dieng, Yuan Liu, Vivek Natarajan, Alan Karthikesalingam, Katherine Heller, Silvia Chiappa, Alexander D'Amour

    Abstract: Diagnosing and mitigating changes in model fairness under distribution shift is an important component of the safe deployment of machine learning in healthcare settings. Importantly, the success of any mitigation strategy strongly depends on the structure of the shift. Despite this, there has been little discussion of how to empirically assess the structure of a distribution shift that one is enco…

    Submitted 10 February, 2023; v1 submitted 2 February, 2022; originally announced February 2022.

    Journal ref: Advances in Neural Information Processing Systems 35 (NeurIPS 2022)

  14. arXiv:2106.15980  [pdf, other]

    stat.ML cs.LG stat.CO

    Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence

    Authors: Ghassen Jerfel, Serena Wang, Clara Fannjiang, Katherine A. Heller, Yian Ma, Michael I. Jordan

    Abstract: Variational Inference (VI) is a popular alternative to asymptotically exact sampling in Bayesian inference. Its main workhorse is optimization over a reverse Kullback-Leibler divergence (RKL), which typically underestimates the tail of the posterior leading to miscalibration and potential degeneracy. Importance sampling (IS), on the other hand, is often used to fine-tune and de-bias the estimates…

    Submitted 30 June, 2021; originally announced June 2021.

    Comments: Accepted for the 37th Conference on Uncertainty in Artificial Intelligence (UAI 2021)

  15. arXiv:2101.06536  [pdf, other]

    cs.LG stat.ME stat.ML

    Deep Cox Mixtures for Survival Regression

    Authors: Chirag Nagpal, Steve Yadlowsky, Negar Rostamzadeh, Katherine Heller

    Abstract: Survival analysis is a challenging variation of regression modeling because of the presence of censoring, where the outcome measurement is only partially known, due to, for example, loss to follow up. Such problems come up frequently in medical applications, making survival analysis a key endeavor in biostatistics and machine learning for healthcare, with Cox regression models being amongst the mo…

    Submitted 26 June, 2022; v1 submitted 16 January, 2021; originally announced January 2021.

    Comments: Machine Learning for Healthcare Conference, 2021

    Journal ref: Proceedings of the 6th Machine Learning for Healthcare Conference, PMLR 149:674-708, 2021

  16. arXiv:2011.03395  [pdf, other]

    cs.LG stat.ML

    Underspecification Presents Challenges for Credibility in Modern Machine Learning

    Authors: Alexander D'Amour, Katherine Heller, Dan Moldovan, Ben Adlam, Babak Alipanahi, Alex Beutel, Christina Chen, Jonathan Deaton, Jacob Eisenstein, Matthew D. Hoffman, Farhad Hormozdiari, Neil Houlsby, Shaobo Hou, Ghassen Jerfel, Alan Karthikesalingam, Mario Lucic, Yian Ma, Cory McLean, Diana Mincu, Akinori Mitani, Andrea Montanari, Zachary Nado, Vivek Natarajan, Christopher Nielson, Thomas F. Osborne, et al. (15 additional authors not shown)

    Abstract: ML models often exhibit unexpectedly poor behavior when they are deployed in real-world domains. We identify underspecification as a key reason for these failures. An ML pipeline is underspecified when it can return many predictors with equivalently strong held-out performance in the training domain. Underspecification is common in modern ML pipelines, such as those based on deep learning. Predict…

    Submitted 24 November, 2020; v1 submitted 6 November, 2020; originally announced November 2020.

    Comments: Updates: Updated statistical analysis in Section 6; Additional citations

  17. arXiv:2005.07186  [pdf, other]

    cs.LG stat.ML

    Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors

    Authors: Michael W. Dusenberry, Ghassen Jerfel, Yeming Wen, Yi-An Ma, Jasper Snoek, Katherine Heller, Balaji Lakshminarayanan, Dustin Tran

    Abstract: Bayesian neural networks (BNNs) demonstrate promising success in improving the robustness and uncertainty quantification of modern deep learning. However, they generally struggle with underfitting at scale and parameter efficiency. On the other hand, deep ensembles have emerged as alternatives for uncertainty quantification that, while outperforming BNNs on certain problems, also suffer from effic…

    Submitted 14 August, 2020; v1 submitted 14 May, 2020; originally announced May 2020.

    Comments: Published in the International Conference on Machine Learning (ICML) 2020. Code available at https://github.com/google/edward2

  18. arXiv:1911.05861  [pdf, other]

    cs.LG stat.ML

    Federated and Differentially Private Learning for Electronic Health Records

    Authors: Stephen R. Pfohl, Andrew M. Dai, Katherine Heller

    Abstract: The use of collaborative and decentralized machine learning techniques such as federated learning has the potential to enable the development and deployment of clinical risk prediction models in low-resource settings without requiring sensitive data be shared or stored in a central repository. This process necessitates communication of model weights or updates between collaborating entities, but…

    Submitted 13 November, 2019; originally announced November 2019.

    Comments: Machine Learning for Health (ML4H) at NeurIPS 2019 - Extended Abstract

  19. Analyzing the Role of Model Uncertainty for Electronic Health Records

    Authors: Michael W. Dusenberry, Dustin Tran, Edward Choi, Jonas Kemp, Jeremy Nixon, Ghassen Jerfel, Katherine Heller, Andrew M. Dai

    Abstract: In medicine, both ethical and monetary costs of incorrect predictions can be significant, and the complexity of the problems often necessitates increasingly complex models. Recent work has shown that changing just the random seed is enough for otherwise well-tuned deep neural networks to vary in their individual predicted probabilities. In light of this, we investigate the role of model uncertaint…

    Submitted 25 March, 2020; v1 submitted 10 June, 2019; originally announced June 2019.

    Comments: Published in the ACM Conference on Health, Inference, and Learning (CHIL) 2020. Code available at https://github.com/Google-Health/records-research

  20. arXiv:1812.06080  [pdf, other]

    cs.LG stat.ML

    Reconciling meta-learning and continual learning with online mixtures of tasks

    Authors: Ghassen Jerfel, Erin Grant, Thomas L. Griffiths, Katherine Heller

    Abstract: Learning-to-learn or meta-learning leverages data-driven inductive bias to increase the efficiency of learning on a novel task. This approach encounters difficulty when transfer is not advantageous, for instance, when tasks are considerably dissimilar or change over time. We use the connection between gradient-based meta-learning and hierarchical Bayes to propose a Dirichlet process mixture of hie…

    Submitted 19 June, 2019; v1 submitted 14 December, 2018; originally announced December 2018.

    Comments: updated experimental results

  21. arXiv:1809.03648  [pdf, other]

    cs.SI

    Who Started It? Identifying Root Sources in Textual Conversation Threads

    Authors: Wei Zhang, Fan Bu, Derek Owens-Oas, Katherine Heller, Xiaojin Zhu

    Abstract: In textual conversation threads, as found on many popular social media platforms, each particular user text comment either originates a new thread of discussion, or replies to a previous comment. An individual who makes an original comment, termed the "root source", is a topic initiator or even an information source, and identifying such individuals is of particular interest. The reply stru…

    Submitted 9 September, 2019; v1 submitted 10 September, 2018; originally announced September 2018.

  22. arXiv:1712.00202  [pdf, other]

    cs.CV

    InverseNet: Solving Inverse Problems with Splitting Networks

    Authors: Kai Fan, Qi Wei, Wenlin Wang, Amit Chakraborty, Katherine Heller

    Abstract: We propose a new method that uses deep learning techniques to solve inverse problems. The inverse problem is cast in the form of learning an end-to-end mapping from observed data to the ground-truth. Inspired by the splitting strategy widely used in regularized iterative algorithms to tackle inverse problems, the mapping is decomposed into two networks, with one handling the inversion of the ph…

    Submitted 1 December, 2017; originally announced December 2017.

  23. arXiv:1709.01841  [pdf, other]

    cs.CV

    An inner-loop free solution to inverse problems using deep neural networks

    Authors: Qi Wei, Kai Fan, Lawrence Carin, Katherine A. Heller

    Abstract: We propose a new method that uses deep learning techniques to accelerate the popular alternating direction method of multipliers (ADMM) solution for inverse problems. The ADMM updates consist of a proximity operator, a least squares regression that includes a big matrix inversion, and an explicit solution for updating the dual variables. Typically, inner loops are required to solve the first two s…

    Submitted 14 November, 2017; v1 submitted 6 September, 2017; originally announced September 2017.

    Journal ref: 2017 Conference on Neural Information Processing Systems (NIPS)

  24. arXiv:1411.2674  [pdf, other]

    stat.ML cs.CL cs.LG cs.SI

    The Bayesian Echo Chamber: Modeling Social Influence via Linguistic Accommodation

    Authors: Fangjian Guo, Charles Blundell, Hanna Wallach, Katherine Heller

    Abstract: We present the Bayesian Echo Chamber, a new Bayesian generative model for social interaction data. By modeling the evolution of people's language usage over time, this model discovers latent influence relationships between them. Unlike previous work on inferring influence, which has primarily focused on simple temporal dynamics evidenced via turn-taking behavior, our model captures more nuanced in…

    Submitted 27 January, 2015; v1 submitted 10 November, 2014; originally announced November 2014.

    Comments: 14 pages, 7 figures, to appear in AISTATS 2015. Fixed minor formatting issues

  25. arXiv:1210.4864  [pdf]

    cs.SI physics.soc-ph stat.AP

    Graph-Coupled HMMs for Modeling the Spread of Infection

    Authors: Wen Dong, Alex Pentland, Katherine A. Heller

    Abstract: We develop Graph-Coupled Hidden Markov Models (GCHMMs) for modeling the spread of infectious disease locally within a social network. Unlike most previous research in epidemiology, which typically models the spread of infection at the level of entire populations, we successfully leverage mobile phone data collected from 84 people over an extended period of time to model the spread of infection on…

    Submitted 16 October, 2012; originally announced October 2012.

    Comments: Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)

    Report number: UAI-P-2012-PG-227-236

  26. arXiv:1204.0168  [pdf]

    stat.AP cs.MA cs.SI physics.soc-ph

    Modeling Infection with Multi-agent Dynamics

    Authors: Wen Dong, Katherine A. Heller, Alex Sandy Pentland

    Abstract: Developing the ability to comprehensively study infections in small populations enables us to improve epidemic models and better advise individuals about potential risks to their health. We currently have a limited understanding of how infections spread within a small population because it has been difficult to closely track an infection within a complete community. The paper presents data closely…

    Submitted 11 October, 2014; v1 submitted 1 April, 2012; originally announced April 2012.

  27. arXiv:1203.3468  [pdf]

    cs.LG stat.ML

    Bayesian Rose Trees

    Authors: Charles Blundell, Yee Whye Teh, Katherine A. Heller

    Abstract: Hierarchical structure is ubiquitous in data across many domains. There are many hierarchical clustering methods, frequently used by domain experts, which strive to discover this structure. However, most of these methods limit discoverable hierarchies to those with binary branching structure. This limitation, while computationally convenient, is often undesirable. In this paper we explore a Bayesi…

    Submitted 15 March, 2012; originally announced March 2012.

    Comments: Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010)

    Report number: UAI-P-2010-PG-65-72

  28. arXiv:1106.1157  [pdf, other]

    cs.LG cs.AI stat.ML

    Bayesian and L1 Approaches to Sparse Unsupervised Learning

    Authors: Shakir Mohamed, Katherine Heller, Zoubin Ghahramani

    Abstract: The use of L1 regularisation for sparse learning has generated immense research interest, with successful application in such diverse areas as signal acquisition, image coding, genomics and collaborative filtering. While existing work highlights the many advantages of L1 methods, in this paper we find that L1 regularisation often dramatically underperforms in terms of predictive performance when c…

    Submitted 17 August, 2012; v1 submitted 6 June, 2011; originally announced June 2011.

    Comments: In Proceedings of the 29th International Conference on Machine Learning (ICML), Edinburgh, Scotland, 2012

  29. arXiv:0912.5193  [pdf, ps, other]

    stat.ME cs.LG physics.soc-ph q-bio.QM stat.AP

    Ranking relations using analogies in biological and information networks

    Authors: Ricardo Silva, Katherine Heller, Zoubin Ghahramani, Edoardo M. Airoldi

    Abstract: Analogical reasoning depends fundamentally on the ability to learn and generalize about relations between objects. We develop an approach to relational learning which, given a set of pairs of objects $\mathbf{S}=\{A^{(1)}:B^{(1)},A^{(2)}:B^{(2)},\ldots,A^{(N)}:B^{(N)}\}$, measures how well other pairs A:B fit in with the set $\mathbf{S}$. Our work addresses the following question: is the relation…

    Submitted 29 August, 2013; v1 submitted 28 December, 2009; originally announced December 2009.

    Comments: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/09-AOAS321

    Report number: IMS-AOAS-AOAS321

    Journal ref: Annals of Applied Statistics 2010, Vol. 4, No. 2, 615-644