-
Feasible and Desirable Counterfactual Generation by Preserving Human Defined Constraints
Authors:
Homayun Afrabandpey,
Michael Spranger
Abstract:
We present a human-in-the-loop approach to generate counterfactual (CF) explanations that preserve global and local feasibility constraints. Global feasibility constraints refer to the causal constraints that are necessary for generating actionable CF explanation. Assuming a domain expert with knowledge on unary and binary causal constraints, our approach efficiently employs this knowledge to gene…
▽ More
We present a human-in-the-loop approach to generate counterfactual (CF) explanations that preserve global and local feasibility constraints. Global feasibility constraints refer to the causal constraints that are necessary for generating actionable CF explanation. Assuming a domain expert with knowledge on unary and binary causal constraints, our approach efficiently employs this knowledge to generate CF explanation by rejecting gradient steps that violate these constraints. Local feasibility constraints encode end-user's constraints for generating desirable CF explanation. We extract these constraints from the end-user of the model and exploit them during CF generation via user-defined distance metric. Through user studies, we demonstrate that incorporating causal constraints during CF generation results in significantly better explanations in terms of feasibility and desirability for participants. Adopting local and global feasibility constraints simultaneously, although improves user satisfaction, does not significantly improve desirability of the participants compared to only incorporating global constraints.
△ Less
Submitted 12 October, 2022;
originally announced October 2022.
-
A Decision-Theoretic Approach for Model Interpretability in Bayesian Framework
Authors:
Homayun Afrabandpey,
Tomi Peltola,
Juho Piironen,
Aki Vehtari,
Samuel Kaski
Abstract:
A salient approach to interpretable machine learning is to restrict modeling to simple models. In the Bayesian framework, this can be pursued by restricting the model structure and prior to favor interpretable models. Fundamentally, however, interpretability is about users' preferences, not the data generation mechanism; it is more natural to formulate interpretability as a utility function. In th…
▽ More
A salient approach to interpretable machine learning is to restrict modeling to simple models. In the Bayesian framework, this can be pursued by restricting the model structure and prior to favor interpretable models. Fundamentally, however, interpretability is about users' preferences, not the data generation mechanism; it is more natural to formulate interpretability as a utility function. In this work, we propose an interpretability utility, which explicates the trade-off between explanation fidelity and interpretability in the Bayesian framework. The method consists of two steps. First, a reference model, possibly a black-box Bayesian predictive model which does not compromise accuracy, is fitted to the training data. Second, a proxy model from an interpretable model family that best mimics the predictive behaviour of the reference model is found by optimizing the interpretability utility function. The approach is model agnostic -- neither the interpretable model nor the reference model are restricted to a certain class of models -- and the optimization problem can be solved using standard tools. Through experiments on real-word data sets, using decision trees as interpretable models and Bayesian additive regression models as reference models, we show that for the same level of interpretability, our approach generates more accurate models than the alternative of restricting the prior. We also propose a systematic way to measure stability of interpretabile models constructed by different interpretability approaches and show that our proposed approach generates more stable models.
△ Less
Submitted 3 August, 2020; v1 submitted 21 October, 2019;
originally announced October 2019.
-
Human-in-the-loop Active Covariance Learning for Improving Prediction in Small Data Sets
Authors:
Homayun Afrabandpey,
Tomi Peltola,
Samuel Kaski
Abstract:
Learning predictive models from small high-dimensional data sets is a key problem in high-dimensional statistics. Expert knowledge elicitation can help, and a strong line of work focuses on directly eliciting informative prior distributions for parameters. This either requires considerable statistical expertise or is laborious, as the emphasis has been on accuracy and not on efficiency of the proc…
▽ More
Learning predictive models from small high-dimensional data sets is a key problem in high-dimensional statistics. Expert knowledge elicitation can help, and a strong line of work focuses on directly eliciting informative prior distributions for parameters. This either requires considerable statistical expertise or is laborious, as the emphasis has been on accuracy and not on efficiency of the process. Another line of work queries about importance of features one at a time, assuming them to be independent and hence missing covariance information. In contrast, we propose eliciting expert knowledge about pairwise feature similarities, to borrow statistical strength in the predictions, and using sequential decision making techniques to minimize the effort of the expert. Empirical results demonstrate improvement in predictive performance on both simulated and real data, in high-dimensional linear regression tasks, where we learn the covariance structure with a Gaussian process, based on sequential elicitation.
△ Less
Submitted 18 March, 2019; v1 submitted 26 February, 2019;
originally announced February 2019.
-
Improving drug sensitivity predictions in precision medicine through active expert knowledge elicitation
Authors:
Iiris Sundin,
Tomi Peltola,
Muntasir Mamun Majumder,
Pedram Daee,
Marta Soare,
Homayun Afrabandpey,
Caroline Heckman,
Samuel Kaski,
Pekka Marttinen
Abstract:
Predicting the efficacy of a drug for a given individual, using high-dimensional genomic measurements, is at the core of precision medicine. However, identifying features on which to base the predictions remains a challenge, especially when the sample size is small. Incorporating expert knowledge offers a promising alternative to improve a prediction model, but collecting such knowledge is laborio…
▽ More
Predicting the efficacy of a drug for a given individual, using high-dimensional genomic measurements, is at the core of precision medicine. However, identifying features on which to base the predictions remains a challenge, especially when the sample size is small. Incorporating expert knowledge offers a promising alternative to improve a prediction model, but collecting such knowledge is laborious to the expert if the number of candidate features is very large. We introduce a probabilistic model that can incorporate expert feedback about the impact of genomic measurements on the sensitivity of a cancer cell for a given drug. We also present two methods to intelligently collect this feedback from the expert, using experimental design and multi-armed bandit models. In a multiple myeloma blood cancer data set (n=51), expert knowledge decreased the prediction error by 8%. Furthermore, the intelligent approaches can be used to reduce the workload of feedback collection to less than 30% on average compared to a naive approach.
△ Less
Submitted 9 May, 2017;
originally announced May 2017.
-
An EM Based Probabilistic Two-Dimensional CCA with Application to Face Recognition
Authors:
Mehran Safayani,
Seyed Hashem Ahmadi,
Homayun Afrabandpey,
Abdolreza Mirzaei
Abstract:
Recently, two-dimensional canonical correlation analysis (2DCCA) has been successfully applied for image feature extraction. The method instead of concatenating the columns of the images to the one-dimensional vectors, directly works with two-dimensional image matrices. Although 2DCCA works well in different recognition tasks, it lacks a probabilistic interpretation. In this paper, we present a pr…
▽ More
Recently, two-dimensional canonical correlation analysis (2DCCA) has been successfully applied for image feature extraction. The method instead of concatenating the columns of the images to the one-dimensional vectors, directly works with two-dimensional image matrices. Although 2DCCA works well in different recognition tasks, it lacks a probabilistic interpretation. In this paper, we present a probabilistic framework for 2DCCA called probabilistic 2DCCA (P2DCCA) and an iterative EM based algorithm for optimizing the parameters. Experimental results on synthetic and real data demonstrate superior performance in loading factor estimation for P2DCCA compared to 2DCCA. For real data, three subsets of AR face database and also the UMIST face database confirm the robustness of the proposed algorithm in face recognition tasks with different illumination conditions, facial expressions, poses and occlusions.
△ Less
Submitted 25 February, 2017;
originally announced February 2017.
-
Interactive Prior Elicitation of Feature Similarities for Small Sample Size Prediction
Authors:
Homayun Afrabandpey,
Tomi Peltola,
Samuel Kaski
Abstract:
Regression under the "small $n$, large $p$" conditions, of small sample size $n$ and large number of features $p$ in the learning data set, is a recurring setting in which learning from data is difficult. With prior knowledge about relationships of the features, $p$ can effectively be reduced, but explicating such prior knowledge is difficult for experts. In this paper we introduce a new method fo…
▽ More
Regression under the "small $n$, large $p$" conditions, of small sample size $n$ and large number of features $p$ in the learning data set, is a recurring setting in which learning from data is difficult. With prior knowledge about relationships of the features, $p$ can effectively be reduced, but explicating such prior knowledge is difficult for experts. In this paper we introduce a new method for eliciting expert prior knowledge about the similarity of the roles of features in the prediction task. The key idea is to use an interactive multidimensional-scaling (MDS) type scatterplot display of the features to elicit the similarity relationships, and then use the elicited relationships in the prior distribution of prediction parameters. Specifically, for learning to predict a target variable with Bayesian linear regression, the feature relationships are used to construct a Gaussian prior with a full covariance matrix for the regression coefficients. Evaluation of our method in experiments with simulated and real users on text data confirm that prior elicitation of feature similarities improves prediction accuracy. Furthermore, elicitation with an interactive scatterplot display outperforms straightforward elicitation where the users choose feature pairs from a feature list.
△ Less
Submitted 28 February, 2017; v1 submitted 8 December, 2016;
originally announced December 2016.
-
Visualizations Relevant to The User By Multi-View Latent Variable Factorization
Authors:
Seppo Virtanen,
Homayun Afrabandpey,
Samuel Kaski
Abstract:
A main goal of data visualization is to find, from among all the available alternatives, mappings to the 2D/3D display which are relevant to the user. Assuming user interaction data, or other auxiliary data about the items or their relationships, the goal is to identify which aspects in the primary data support the userś input and, equally importantly, which aspects of the userś potentially noisy…
▽ More
A main goal of data visualization is to find, from among all the available alternatives, mappings to the 2D/3D display which are relevant to the user. Assuming user interaction data, or other auxiliary data about the items or their relationships, the goal is to identify which aspects in the primary data support the userś input and, equally importantly, which aspects of the userś potentially noisy input have support in the primary data. For solving the problem, we introduce a multi-view embedding in which a latent factorization identifies which aspects in the two data views (primary data and user data) are related and which are specific to only one of them. The factorization is a generative model in which the display is parameterized as a part of the factorization and the other factors explain away the aspects not expressible in a two-dimensional display. Functioning of the model is demonstrated on several data sets.
△ Less
Submitted 25 January, 2016; v1 submitted 24 December, 2015;
originally announced December 2015.
-
Fuzzy Least Squares Twin Support Vector Machines
Authors:
Javad Salimi Sartakhti,
Homayun Afrabandpey,
Nasser Ghadiri
Abstract:
Least Squares Twin Support Vector Machine (LST-SVM) has been shown to be an efficient and fast algorithm for binary classification. It combines the operating principles of Least Squares SVM (LS-SVM) and Twin SVM (T-SVM); it constructs two non-parallel hyperplanes (as in T-SVM) by solving two systems of linear equations (as in LS-SVM). Despite its efficiency, LST-SVM is still unable to cope with tw…
▽ More
Least Squares Twin Support Vector Machine (LST-SVM) has been shown to be an efficient and fast algorithm for binary classification. It combines the operating principles of Least Squares SVM (LS-SVM) and Twin SVM (T-SVM); it constructs two non-parallel hyperplanes (as in T-SVM) by solving two systems of linear equations (as in LS-SVM). Despite its efficiency, LST-SVM is still unable to cope with two features of real-world problems. First, in many real-world applications, labels of samples are not deterministic; they come naturally with their associated membership degrees. Second, samples in real-world applications may not be equally important and their importance degrees affect the classification. In this paper, we propose Fuzzy LST-SVM (FLST-SVM) to deal with these two characteristics of real-world data. Two models are introduced for FLST-SVM: the first model builds up crisp hyperplanes using training samples and their corresponding membership degrees. The second model, on the other hand, constructs fuzzy hyperplanes using training samples and their membership degrees. Numerical evaluation of the proposed method with synthetic and real datasets demonstrate significant improvement in the classification accuracy of FLST-SVM when compared to well-known existing versions of SVM.
△ Less
Submitted 21 November, 2018; v1 submitted 20 May, 2015;
originally announced May 2015.