Showing 1–12 of 12 results for author: Provost, F

Searching in archive cs.

  1. arXiv:2406.09567  [pdf, other]

    stat.ML cs.LG

    Causal Fine-Tuning and Effect Calibration of Non-Causal Predictive Models

    Authors: Carlos Fernández-Loría, Yanfang Hou, Foster Provost, Jennifer Hill

    Abstract: This paper proposes techniques to enhance the performance of non-causal models for causal inference using data from randomized experiments. In domains like advertising, customer retention, and precision medicine, non-causal models that predict outcomes under no intervention are often used to score individuals and rank them according to the expected effectiveness of an intervention (e.g., an ad, a r…

    Submitted 13 June, 2024; originally announced June 2024.

  2. arXiv:2312.15000  [pdf, other]

    cs.CY

    The Impact of Cloaking Digital Footprints on User Privacy and Personalization

    Authors: Sofie Goethals, Sandra Matz, Foster Provost, Yanou Ramon, David Martens

    Abstract: Our online lives generate a wealth of behavioral records -'digital footprints'- which are stored and leveraged by technology platforms. This data can be used to create value for users by personalizing services. At the same time, however, it also poses a threat to people's privacy by offering a highly intimate window into their private traits (e.g., their personality, political ideology, sexual ori…

    Submitted 22 December, 2023; originally announced December 2023.

  3. arXiv:2104.04103  [pdf, other]

    stat.ML cs.LG

    Causal Decision Making and Causal Effect Estimation Are Not the Same... and Why It Matters

    Authors: Carlos Fernández-Loría, Foster Provost

    Abstract: Causal decision making (CDM) based on machine learning has become a routine part of business. Businesses algorithmically target offers, incentives, and recommendations to affect consumer behavior. Recently, we have seen an acceleration of research related to CDM and causal effect estimation (CEE) using machine-learned models. This article highlights an important perspective: CDM is not the same as…

    Submitted 30 September, 2021; v1 submitted 8 April, 2021; originally announced April 2021.

  4. arXiv:2004.11532  [pdf, other]

    econ.EM cs.LG stat.ME stat.ML

    A Comparison of Methods for Treatment Assignment with an Application to Playlist Generation

    Authors: Carlos Fernández-Loría, Foster Provost, Jesse Anderton, Benjamin Carterette, Praveen Chandar

    Abstract: This study presents a systematic comparison of methods for individual treatment assignment, a general problem that arises in many applications and has received significant attention from economists, computer scientists, and social scientists. We group the various methods proposed in the literature into three general classes of algorithms (or metalearners): learning models to predict outcomes (the…

    Submitted 30 April, 2022; v1 submitted 24 April, 2020; originally announced April 2020.

  5. arXiv:2001.07417  [pdf, other]

    cs.LG cs.AI stat.ML

    Explaining Data-Driven Decisions made by AI Systems: The Counterfactual Approach

    Authors: Carlos Fernández-Loría, Foster Provost, Xintian Han

    Abstract: We examine counterfactual explanations for explaining the decisions made by model-based AI systems. The counterfactual approach we consider defines an explanation as a set of the system's data inputs that causally drives the decision (i.e., changing the inputs in the set changes the decision) and is irreducible (i.e., changing any subset of the inputs does not change the decision). We (1) demonstr…

    Submitted 13 October, 2021; v1 submitted 21 January, 2020; originally announced January 2020.

  6. Counterfactual Explanation Algorithms for Behavioral and Textual Data

    Authors: Yanou Ramon, David Martens, Foster Provost, Theodoros Evgeniou

    Abstract: We study the interpretability of predictive systems that use high-dimensional behavioral and textual data. Examples include predicting product interest based on online browsing data and detecting spam emails or objectionable web content. Recently, counterfactual explanations have been proposed for generating insight into model predictions, which focus on what is relevant to a particular instance. C…

    Submitted 4 December, 2019; originally announced December 2019.

    Comments: 24 pages, 7 figures, currently under review

  7. arXiv:1706.03102  [pdf]

    cs.CY

    Big Data, Data Science, and Civil Rights

    Authors: Solon Barocas, Elizabeth Bradley, Vasant Honavar, Foster Provost

    Abstract: Advances in data analytics bring with them civil rights implications. Data-driven and algorithmic decision making increasingly determine how businesses target advertisements to consumers, how police departments monitor individuals or groups, how banks decide who gets a loan and who does not, how employers hire, how colleges and universities make admissions and financial aid decisions, and much mor…

    Submitted 9 June, 2017; originally announced June 2017.

    Comments: A Computing Community Consortium (CCC) white paper, 8 pages

  8. arXiv:1607.06280  [pdf, other]

    stat.ML cs.LG

    Explaining Classification Models Built on High-Dimensional Sparse Data

    Authors: Julie Moeyersoms, Brian d'Alessandro, Foster Provost, David Martens

    Abstract: Predictive modeling applications increasingly use data representing people's behavior, opinions, and interactions. Fine-grained behavior data often has different structure from traditional data, being very high-dimensional and sparse. Models built from these data are quite difficult to interpret, since they contain many thousands or even many millions of features. Listing features with large model…

    Submitted 26 July, 2016; v1 submitted 21 July, 2016; originally announced July 2016.

    Comments: 5 pages, 1 figure, 2 Tables; ICML conference, Workshop on Human Interpretability In Machine Learning

  9. arXiv:1606.08063  [pdf, ps, other]

    stat.ML cs.CY

    Enhancing Transparency and Control when Drawing Data-Driven Inferences about Individuals

    Authors: Daizhuo Chen, Samuel P. Fraiberger, Robert Moakler, Foster Provost

    Abstract: Recent studies have shown that information disclosed on social network sites (such as Facebook) can be used to predict personal characteristics with surprisingly high accuracy. In this paper we examine a method to give online users transparency into why certain inferences are made about them by statistical models, and control to inhibit those inferences by hiding ("cloaking") certain personal info…

    Submitted 26 June, 2016; originally announced June 2016.

    Comments: presented at 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), New York, NY

  10. Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction

    Authors: F. Provost, G. M. Weiss

    Abstract: For large, real-world inductive learning problems, the number of training examples often must be limited due to the costs associated with procuring, preparing, and storing the training examples and/or the computational costs associated with learning from them. In such circumstances, one question of practical importance is: if only n training examples can be selected, in what proportion should the…

    Submitted 22 June, 2011; originally announced June 2011.

    Journal ref: Journal of Artificial Intelligence Research, Volume 19, pages 315-354, 2003

  11. arXiv:cs/0010006  [pdf, ps, other]

    cs.LG cs.DB

    Applications of Data Mining to Electronic Commerce

    Authors: Ron Kohavi, Foster Provost

    Abstract: Electronic commerce is emerging as the killer domain for data mining technology. The following are five desiderata for success. Seldom are they all present in one data mining application. 1. Data with rich descriptions. For example, wide customer records with many potentially useful fields allow data mining algorithms to search beyond obvious correlations. 2. A large volume of data. T…

    Submitted 2 October, 2000; originally announced October 2000.

    Comments: Editorial for special issue

    ACM Class: I.2.6; H.2.8

  12. arXiv:cs/0009007  [pdf, ps, other]

    cs.LG

    Robust Classification for Imprecise Environments

    Authors: Foster Provost, Tom Fawcett

    Abstract: In real-world environments it usually is difficult to specify target operating conditions precisely, for example, target misclassification costs. This uncertainty makes building robust classification systems problematic. We show that it is possible to build a hybrid classifier that will perform at least as well as the best available classifier for any target conditions. In some cases, the perfor…

    Submitted 13 September, 2000; originally announced September 2000.

    Comments: 24 pages, 12 figures. To be published in Machine Learning Journal. For related papers, see http://www.hpl.hp.com/personal/Tom_Fawcett/ROCCH/

    ACM Class: I.2.6