-
AIRIVA: A Deep Generative Model of Adaptive Immune Repertoires
Authors:
Melanie F. Pradier,
Niranjani Prasad,
Paidamoyo Chapfuwa,
Sahra Ghalebikesabi,
Max Ilse,
Steven Woodhouse,
Rebecca Elyanow,
Javier Zazo,
Javier Gonzalez,
Julia Greissl,
Edward Meeds
Abstract:
Recent advances in immunomics have shown that T-cell receptor (TCR) signatures can accurately predict active or recent infection by leveraging the high specificity of TCR binding to disease antigens. However, the extreme diversity of the adaptive immune repertoire presents challenges in reliably identifying disease-specific TCRs. Population genetics and sequencing depth can also have strong systematic effects on repertoires, which requires careful consideration when developing diagnostic models. We present an Adaptive Immune Repertoire-Invariant Variational Autoencoder (AIRIVA), a generative model that learns a low-dimensional, interpretable, and compositional representation of TCR repertoires to disentangle such systematic effects in repertoires. We apply AIRIVA to two infectious disease case-studies: COVID-19 (natural infection and vaccination) and the Herpes Simplex Virus (HSV-1 and HSV-2), and empirically show that we can disentangle the individual disease signals. We further demonstrate AIRIVA's capability to: learn from unlabelled samples; generate in-silico TCR repertoires by intervening on the latent factors; and identify disease-associated TCRs validated using TCR annotations from external assay data.
Submitted 26 April, 2023;
originally announced April 2023.
-
Repairing Neural Networks by Leaving the Right Past Behind
Authors:
Ryutaro Tanno,
Melanie F. Pradier,
Aditya Nori,
Yingzhen Li
Abstract:
Prediction failures of machine learning models often arise from deficiencies in training data, such as incorrect labels, outliers, and selection biases. However, the data points responsible for a given failure mode are generally not known a priori, let alone a mechanism for repairing the failure. This work draws on the Bayesian view of continual learning and develops a generic framework for both identifying the training examples that gave rise to the target failure and fixing the model by erasing information about them. This framework naturally allows recent advances in continual learning to be applied to this new problem of model repair, while subsuming existing work on influence functions and data deletion as specific instances. Experimentally, the proposed approach outperforms the baselines both in identifying detrimental training data and in fixing model failures in a generalisable manner.
Submitted 9 November, 2022; v1 submitted 11 July, 2022;
originally announced July 2022.
-
Designing AI for Trust and Collaboration in Time-Constrained Medical Decisions: A Sociotechnical Lens
Authors:
Maia Jacobs,
Jeffrey He,
Melanie F. Pradier,
Barbara Lam,
Andrew C. Ahn,
Thomas H. McCoy,
Roy H. Perlis,
Finale Doshi-Velez,
Krzysztof Z. Gajos
Abstract:
Major depressive disorder is a debilitating disease affecting 264 million people worldwide. While many antidepressant medications are available, few clinical guidelines support choosing among them. Decision support tools (DSTs) embodying machine learning models may help improve the treatment selection process, but often fail in clinical practice due to poor system integration.
We use an iterative, co-design process to investigate clinicians' perceptions of using DSTs in antidepressant treatment decisions. We identify ways in which DSTs need to engage with the healthcare sociotechnical system, including clinical processes, patient preferences, resource constraints, and domain knowledge. Our results suggest that clinical DSTs should be designed as multi-user systems that support patient-provider collaboration and offer on-demand explanations that address discrepancies between predictions and current standards of care. Through this work, we demonstrate how current trends in explainable AI may be inappropriate for clinical environments and consider paths towards designing these tools for real-world medical systems.
Submitted 31 January, 2021;
originally announced February 2021.
-
Preferential Mixture-of-Experts: Interpretable Models that Rely on Human Expertise as much as Possible
Authors:
Melanie F. Pradier,
Javier Zazo,
Sonali Parbhoo,
Roy H. Perlis,
Maurizio Zazzi,
Finale Doshi-Velez
Abstract:
We propose Preferential MoE, a novel human-ML mixture-of-experts model that augments human expertise in decision making with a data-based classifier only when necessary for predictive performance. Our model exhibits an interpretable gating function that provides information on when human rules should be followed or avoided. The gating function is maximized for using human-based rules, and classification errors are minimized. We propose solving a coupled multi-objective problem with convex subproblems. We develop approximate algorithms and study their performance and convergence. Finally, we demonstrate the utility of Preferential MoE on two clinical applications for the treatment of Human Immunodeficiency Virus (HIV) and management of Major Depressive Disorder (MDD).
Submitted 13 January, 2021;
originally announced January 2021.
-
Towards Expressive Priors for Bayesian Neural Networks: Poisson Process Radial Basis Function Networks
Authors:
Beau Coker,
Melanie F. Pradier,
Finale Doshi-Velez
Abstract:
While Bayesian neural networks have many appealing characteristics, current priors do not easily allow users to specify basic properties such as expected lengthscale or amplitude variance. In this work, we introduce Poisson Process Radial Basis Function Networks, a novel prior that is able to encode amplitude stationarity and input-dependent lengthscale. We prove that our novel formulation allows for a decoupled specification of these properties, and that the estimated regression function is consistent as the number of observations tends to infinity. We demonstrate its behavior on synthetic and real examples.
Submitted 12 December, 2019;
originally announced December 2019.
-
Output-Constrained Bayesian Neural Networks
Authors:
Wanqian Yang,
Lars Lorch,
Moritz A. Graule,
Srivatsan Srinivasan,
Anirudh Suresh,
Jiayu Yao,
Melanie F. Pradier,
Finale Doshi-Velez
Abstract:
Bayesian neural network (BNN) priors are defined in parameter space, making it hard to encode prior knowledge expressed in function space. We formulate a prior that incorporates functional constraints about what the output can or cannot be in regions of the input space. Output-Constrained BNNs (OC-BNNs) represent an interpretable approach to enforcing a range of constraints, fully consistent with the Bayesian framework and amenable to black-box inference. We demonstrate how OC-BNNs improve model robustness and prevent the prediction of infeasible outputs in two real-world applications of healthcare and robotics.
Submitted 15 May, 2019;
originally announced May 2019.
-
Unsupervised Extraction of Phenotypes from Cancer Clinical Notes for Association Studies
Authors:
Stefan G. Stark,
Stephanie L. Hyland,
Melanie F. Pradier,
Kjong Lehmann,
Andreas Wicki,
Fernando Perez Cruz,
Julia E. Vogt,
Gunnar Rätsch
Abstract:
The recent adoption of Electronic Health Records (EHRs) by health care providers has introduced an important source of data that provides detailed and highly specific insights into patient phenotypes over large cohorts. These datasets, in combination with machine learning and statistical approaches, generate new opportunities for research and clinical care. However, many methods require the patient representations to be in structured formats, while the information in the EHR is often locked in unstructured texts designed for human readability. In this work, we develop a methodology to automatically extract clinical features from clinical narratives in large EHR corpora without the need for prior knowledge. We consider medical terms and sentences appearing in clinical narratives as atomic information units. We propose an efficient clustering strategy suitable for the analysis of large text corpora and use the clusters to represent information about the patient compactly. To demonstrate the utility of our approach, we perform an association study of clinical features with somatic mutation profiles from 4,007 cancer patients and their tumors. We apply the proposed algorithm to a dataset consisting of about 65 thousand documents with a total of about 3.2 million sentences. We identify 341 significant statistical associations between the presence of somatic mutations and clinical features. We annotate these associations according to their novelty and report several known associations. We also propose 32 testable hypotheses where the underlying biological mechanism does not appear to be known but is plausible. These results illustrate that the automated discovery of clinical features is possible and that the joint analysis of clinical and genetic datasets can generate appealing new hypotheses.
Submitted 3 May, 2019; v1 submitted 29 April, 2019;
originally announced April 2019.
-
Projected BNNs: Avoiding weight-space pathologies by learning latent representations of neural network weights
Authors:
Melanie F. Pradier,
Weiwei Pan,
Jiayu Yao,
Soumya Ghosh,
Finale Doshi-Velez
Abstract:
As machine learning systems are widely adopted for high-stakes decisions, quantifying uncertainty over predictions becomes crucial. While modern neural networks are making remarkable gains in terms of predictive accuracy, characterizing uncertainty over the parameters of these models is challenging because of the high dimensionality and complex correlations of the network parameter space. This paper introduces a novel variational inference framework for Bayesian neural networks that (1) encodes complex distributions in high-dimensional parameter space with representations in a low-dimensional latent space, and (2) performs inference efficiently on the low-dimensional representations. Across a large array of synthetic and real-world datasets, we show that our method improves uncertainty characterization and model generalization when compared with methods that work directly in the parameter space.
Submitted 12 June, 2019; v1 submitted 16 November, 2018;
originally announced November 2018.
-
Sparse Three-parameter Restricted Indian Buffet Process for Understanding International Trade
Authors:
Melanie F. Pradier,
Viktor Stojkoski,
Zoran Utkovski,
Ljupco Kocarev,
Fernando Perez-Cruz
Abstract:
This paper presents a Bayesian nonparametric latent feature model especially suitable for exploratory analysis of high-dimensional count data. We perform a non-negative doubly sparse matrix factorization that has two main advantages: not only are we able to better approximate the row input distributions, but the inferred topics are also easier to interpret. By combining the three-parameter and restricted Indian buffet processes into a single prior, we increase the model flexibility, allowing for a full spectrum of sparse solutions in the latent space. We demonstrate the usefulness of our approach in the analysis of countries' economic structure. Compared to other approaches, empirical results show our model's ability to give easy-to-interpret information and better capture the underlying sparsity structure of data.
Submitted 29 June, 2018;
originally announced June 2018.
-
General Latent Feature Modeling for Data Exploration Tasks
Authors:
Isabel Valera,
Melanie F. Pradier,
Zoubin Ghahramani
Abstract:
This paper introduces a general Bayesian nonparametric latent feature model suitable for automatic exploratory analysis of heterogeneous datasets, where the attributes describing each object can be discrete, continuous, or mixed variables. The proposed model presents several important properties. First, it accounts for heterogeneous data while its inference runs in linear time with respect to the number of objects and attributes. Second, its Bayesian nonparametric nature allows us to automatically infer the model complexity from the data, i.e., the number of features necessary to capture the latent structure in the data. Third, the latent features in the model are binary-valued variables, easing the interpretability of the obtained latent features in data exploration tasks.
Submitted 26 July, 2017;
originally announced July 2017.