Showing 1–31 of 31 results for author: Casalicchio, G

  1. arXiv:2406.18334  [pdf, other]

    cs.LG stat.ML

    Efficient and Accurate Explanation Estimation with Distribution Compression

    Authors: Hubert Baniecki, Giuseppe Casalicchio, Bernd Bischl, Przemyslaw Biecek

    Abstract: Exact computation of various machine learning explanations requires numerous model evaluations and in extreme cases becomes impractical. The computational cost of approximation increases with an ever-increasing size of data and model parameters. Many heuristics have been proposed to approximate post-hoc explanations efficiently. This paper shows that the standard i.i.d. sampling used in a broad sp…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: To be presented at the ICML 2024 Workshop on DMLR

  2. arXiv:2406.09069  [pdf, other]

    cs.LG stat.ML

    On the Robustness of Global Feature Effect Explanations

    Authors: Hubert Baniecki, Giuseppe Casalicchio, Bernd Bischl, Przemyslaw Biecek

    Abstract: We study the robustness of global post-hoc explanations for predictive models trained on tabular data. Effects of predictor features in black-box supervised learning are an essential diagnostic tool for model debugging and scientific discovery in applied sciences. However, how vulnerable they are to data and model perturbations remains an open research question. We introduce several theoretical bo…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted at ECML PKDD 2024

  3. arXiv:2405.02200  [pdf, other]

    cs.LG stat.ML

    Position: Why We Must Rethink Empirical Research in Machine Learning

    Authors: Moritz Herrmann, F. Julian D. Lange, Katharina Eggensperger, Giuseppe Casalicchio, Marcel Wever, Matthias Feurer, David Rügamer, Eyke Hüllermeier, Anne-Laure Boulesteix, Bernd Bischl

    Abstract: We warn against a common but incomplete understanding of empirical research in machine learning that leads to non-replicable results, makes findings unreliable, and threatens to undermine progress in the field. To overcome this alarming situation, we call for more awareness of the plurality of ways of gaining knowledge experimentally but also of some epistemic limitations. In particular, we argue…

    Submitted 25 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: 20 pages, accepted for publication at ICML 2024, camera-ready version

  4. arXiv:2404.16899  [pdf, other]

    cs.LG

    mlr3summary: Concise and interpretable summaries for machine learning models

    Authors: Susanne Dandl, Marc Becker, Bernd Bischl, Giuseppe Casalicchio, Ludwig Bothmann

    Abstract: This work introduces a novel R package for concise, informative summaries of machine learning models. We take inspiration from the summary function for (generalized) linear models in R, but extend it in several directions: First, our summary function is model-agnostic and provides a unified summary output also for non-parametric machine learning models; Second, the summary output is more ext…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: 9 pages

  5. arXiv:2404.12862  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    A Guide to Feature Importance Methods for Scientific Inference

    Authors: Fiona Katharina Ewald, Ludwig Bothmann, Marvin N. Wright, Bernd Bischl, Giuseppe Casalicchio, Gunnar König

    Abstract: While machine learning (ML) models are increasingly used due to their high predictive power, their use in understanding the data-generating process (DGP) is limited. Understanding the DGP requires insights into feature-target associations, which many ML models cannot directly provide due to their opaque internal mechanisms. Feature importance (FI) methods provide useful insights into the DGP under…

    Submitted 29 August, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

    Journal ref: Longo, L., Lapuschkin, S., Seifert, C. (eds) Explainable Artificial Intelligence. xAI 2024. Communications in Computer and Information Science, vol 2154. Springer, Cham

  6. arXiv:2404.02629  [pdf, other]

    cs.LG

    Effector: A Python package for regional explanations

    Authors: Vasilis Gkolemis, Christos Diou, Eirini Ntoutsi, Theodore Dalamagas, Bernd Bischl, Julia Herbinger, Giuseppe Casalicchio

    Abstract: Global feature effect methods explain a model outputting one plot per feature. The plot shows the average effect of the feature on the output, like the effect of age on the annual income. However, average effects may be misleading when derived from local effects that are heterogeneous, i.e., they significantly deviate from the average. To decrease the heterogeneity, regional effects provide multip…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: 33 pages, 17 figures

  7. arXiv:2403.04629  [pdf, other]

    cs.LG cs.AI cs.HC cs.RO stat.ML

    Explaining Bayesian Optimization by Shapley Values Facilitates Human-AI Collaboration

    Authors: Julian Rodemann, Federico Croppi, Philipp Arens, Yusuf Sale, Julia Herbinger, Bernd Bischl, Eyke Hüllermeier, Thomas Augustin, Conor J. Walsh, Giuseppe Casalicchio

    Abstract: Bayesian optimization (BO) with Gaussian processes (GP) has become an indispensable algorithm for black box optimization problems. Not without a dash of irony, BO is often considered a black box itself, lacking ways to provide reasons as to why certain parameters are proposed to be evaluated. This is particularly relevant in human-in-the-loop applications of BO, such as in robotics. We address thi…

    Submitted 8 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: Preprint. Copyright by the authors. 19 pages, 24 figures

    ACM Class: I.2.6; I.2.9; F.2.2; J.6

  8. arXiv:2312.13234  [pdf, other]

    cs.LG

    Position Paper: Bridging the Gap Between Machine Learning and Sensitivity Analysis

    Authors: Christian A. Scholbeck, Julia Moosbauer, Giuseppe Casalicchio, Hoshin Gupta, Bernd Bischl, Christian Heumann

    Abstract: We argue that interpretations of machine learning (ML) models or the model-building process can be seen as a form of sensitivity analysis (SA), a general methodology used to explain complex systems in many fields such as environmental modeling, engineering, or economics. We address both researchers and practitioners, calling attention to the benefits of a unified SA-based view of explanations in…

    Submitted 20 December, 2023; originally announced December 2023.

  9. arXiv:2310.03112  [pdf, other]

    stat.ML cs.LG

    Leveraging Model-based Trees as Interpretable Surrogate Models for Model Distillation

    Authors: Julia Herbinger, Susanne Dandl, Fiona K. Ewald, Sofia Loibl, Giuseppe Casalicchio

    Abstract: Surrogate models play a crucial role in retrospectively interpreting complex and powerful black box machine learning models via model distillation. This paper focuses on using model-based trees as surrogate models which partition the feature space into interpretable regions via decision rules. Within each region, interpretable models based on additive main effects are used to approximate the behav…

    Submitted 4 October, 2023; originally announced October 2023.

  10. arXiv:2310.02008  [pdf, other]

    cs.LG econ.EM stat.ML

    fmeffects: An R Package for Forward Marginal Effects

    Authors: Holger Löwe, Christian A. Scholbeck, Christian Heumann, Bernd Bischl, Giuseppe Casalicchio

    Abstract: Forward marginal effects (FMEs) have recently been introduced as a versatile and effective model-agnostic interpretation method. They provide comprehensible and actionable model explanations in the form of: If we change $x$ by an amount $h$, what is the change in predicted outcome $\widehat{y}$? We present the R package fmeffects, the first software implementation of FMEs. The relevant theoretical…

    Submitted 3 October, 2023; originally announced October 2023.

  11. arXiv:2306.00541  [pdf, other]

    stat.ML cs.LG

    Decomposing Global Feature Effects Based on Feature Interactions

    Authors: Julia Herbinger, Marvin N. Wright, Thomas Nagler, Bernd Bischl, Giuseppe Casalicchio

    Abstract: Global feature effect methods, such as partial dependence plots, provide an intelligible visualization of the expected marginal feature effect. However, such global feature effect methods can be misleading, as they do not represent local feature effects of single observations well when feature interactions are present. We formally introduce generalized additive decomposition of global effects (GAD…

    Submitted 1 July, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

  12. Interpretable Regional Descriptors: Hyperbox-Based Local Explanations

    Authors: Susanne Dandl, Giuseppe Casalicchio, Bernd Bischl, Ludwig Bothmann

    Abstract: This work introduces interpretable regional descriptors, or IRDs, for local, model-agnostic interpretations. IRDs are hyperboxes that describe how an observation's feature values can be changed without affecting its prediction. They justify a prediction by providing a set of "even if" arguments (semi-factual explanations), and they indicate which features affect a prediction and whether pointwise…

    Submitted 4 May, 2023; originally announced May 2023.

    Journal ref: Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol. 14171, p. 479-495

  13. arXiv:2304.06569  [pdf, other]

    stat.ML cs.LG stat.CO

    counterfactuals: An R Package for Counterfactual Explanation Methods

    Authors: Susanne Dandl, Andreas Hofheinz, Martin Binder, Bernd Bischl, Giuseppe Casalicchio

    Abstract: Counterfactual explanation methods provide information on how feature values of individual observations must be changed to obtain a desired prediction. Despite the increasing amount of proposed methods in research, only a few implementations exist whose interfaces and requirements vary widely. In this work, we introduce the counterfactuals R package, which provides a modular and unified R6-based i…

    Submitted 15 September, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: 49 pages LaTeX, updated benchmark results

  14. arXiv:2209.10578  [pdf, other]

    cs.LG stat.ML

    Algorithm-Agnostic Interpretations for Clustering

    Authors: Christian A. Scholbeck, Henri Funk, Giuseppe Casalicchio

    Abstract: A clustering outcome for high-dimensional data is typically interpreted via post-processing, involving dimension reduction and subsequent visualization. This destroys the meaning of the data and obfuscates interpretations. We propose algorithm-agnostic interpretation methods to explain clustering outcomes in reduced dimensions while preserving the integrity of the data. The permutation feature imp…

    Submitted 21 September, 2022; originally announced September 2022.

  15. arXiv:2206.05447  [pdf, other]

    cs.LG stat.ML

    Improving Accuracy of Interpretability Measures in Hyperparameter Optimization via Bayesian Algorithm Execution

    Authors: Julia Moosbauer, Giuseppe Casalicchio, Marius Lindauer, Bernd Bischl

    Abstract: Despite all the benefits of automated hyperparameter optimization (HPO), most modern HPO algorithms are black-boxes themselves. This makes it difficult to understand the decision process which leads to the selected configuration, reduces trust in HPO, and thus hinders its broad adoption. Here, we study the combination of HPO with interpretable machine learning (IML) methods such as partial depende…

    Submitted 12 February, 2023; v1 submitted 11 June, 2022; originally announced June 2022.

  16. arXiv:2202.07254  [pdf, other]

    stat.ML cs.LG

    REPID: Regional Effect Plots with implicit Interaction Detection

    Authors: Julia Herbinger, Bernd Bischl, Giuseppe Casalicchio

    Abstract: Machine learning models can automatically learn complex relationships, such as non-linear and interaction effects. Interpretable machine learning methods such as partial dependence plots visualize marginal feature effects but may lead to misleading interpretations when feature interactions are present. Hence, employing additional methods that can detect and measure the strength of interactions is…

    Submitted 15 February, 2022; originally announced February 2022.

  17. arXiv:2201.08837  [pdf, other]

    cs.LG econ.EM stat.AP stat.ME stat.ML

    Marginal Effects for Non-Linear Prediction Functions

    Authors: Christian A. Scholbeck, Giuseppe Casalicchio, Christoph Molnar, Bernd Bischl, Christian Heumann

    Abstract: Beta coefficients for linear regression models represent the ideal form of an interpretable feature effect. However, for non-linear models and especially generalized linear models, the estimated coefficients cannot be interpreted as a direct feature effect on the predicted outcome. Hence, marginal effects are typically used as approximations for feature effects, either in the shape of derivatives…

    Submitted 21 January, 2022; originally announced January 2022.

  18. arXiv:2111.04820  [pdf, other]

    cs.LG stat.ML

    Explaining Hyperparameter Optimization via Partial Dependence Plots

    Authors: Julia Moosbauer, Julia Herbinger, Giuseppe Casalicchio, Marius Lindauer, Bernd Bischl

    Abstract: Automated hyperparameter optimization (HPO) can support practitioners to obtain peak performance in machine learning models. However, there is often a lack of valuable insights into the effects of different hyperparameters on the final model performance. This lack of explainability makes it difficult to trust and understand the automated HPO process and its results. We suggest using interpretable…

    Submitted 26 January, 2022; v1 submitted 8 November, 2021; originally announced November 2021.

    Comments: to be published in proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS 2021); typos corrected, replaced N by N' in formula (6)

  19. Relating the Partial Dependence Plot and Permutation Feature Importance to the Data Generating Process

    Authors: Christoph Molnar, Timo Freiesleben, Gunnar König, Giuseppe Casalicchio, Marvin N. Wright, Bernd Bischl

    Abstract: Scientists and practitioners increasingly rely on machine learning to model data and draw conclusions. Compared to statistical modeling approaches, machine learning makes fewer explicit assumptions about data structures, such as linearity. However, their model parameters usually cannot be easily related to the data generating process. To learn about the modeled relationships, partial dependence (P…

    Submitted 3 September, 2021; originally announced September 2021.

    Journal ref: Longo, L. (eds) Explainable Artificial Intelligence. xAI 2023. Communications in Computer and Information Science, vol 1901

  20. arXiv:2107.14330  [pdf, ps, other]

    cs.CY cs.LG

    Developing Open Source Educational Resources for Machine Learning and Data Science

    Authors: Ludwig Bothmann, Sven Strickroth, Giuseppe Casalicchio, David Rügamer, Marius Lindauer, Fabian Scheipl, Bernd Bischl

    Abstract: Education should not be a privilege but a common good. It should be openly accessible to everyone, with as few barriers as possible; even more so for key technologies such as Machine Learning (ML) and Data Science (DS). Open Educational Resources (OER) are a crucial factor for greater educational equity. In this paper, we describe the specific requirements for OER in ML and DS and argue that it is…

    Submitted 10 August, 2021; v1 submitted 28 July, 2021; originally announced July 2021.

    Comments: 6 pages

    Journal ref: Proceedings of the Third Teaching Machine Learning and Artificial Intelligence Workshop, PMLR 207:1-6, 2022

  21. arXiv:2106.08086  [pdf, other]

    stat.ML cs.LG

    Decomposition of Global Feature Importance into Direct and Associative Components (DEDACT)

    Authors: Gunnar König, Timo Freiesleben, Bernd Bischl, Giuseppe Casalicchio, Moritz Grosse-Wentrup

    Abstract: Global model-agnostic feature importance measures either quantify whether features are directly used for a model's predictions (direct importance) or whether they contain prediction-relevant information (associative importance). Direct importance provides causal insight into the model's mechanism, yet it fails to expose the leakage of information from associated but not directly used variables. In…

    Submitted 15 June, 2021; originally announced June 2021.

  22. Grouped Feature Importance and Combined Features Effect Plot

    Authors: Quay Au, Julia Herbinger, Clemens Stachl, Bernd Bischl, Giuseppe Casalicchio

    Abstract: Interpretable machine learning has become a very active area of research due to the rising popularity of machine learning algorithms and their inherently challenging interpretability. Most work in this area has been focused on the interpretation of single features in a model. However, for researchers and practitioners, it is often equally important to quantify the importance or visualize the effec…

    Submitted 23 April, 2021; originally announced April 2021.

    Journal ref: Data Mining and Knowledge Discovery 36, 1401–1450 (2022)

  23. Interpretable Machine Learning -- A Brief History, State-of-the-Art and Challenges

    Authors: Christoph Molnar, Giuseppe Casalicchio, Bernd Bischl

    Abstract: We present a brief history of the field of interpretable machine learning (IML), give an overview of state-of-the-art interpretation methods, and discuss challenges. Research in IML has boomed in recent years. As young as the field is, it has over 200 years old roots in regression modeling and rule-based machine learning, starting in the 1960s. Recently, many new IML methods have been proposed, ma…

    Submitted 19 October, 2020; originally announced October 2020.

    Journal ref: Koprinska I. et al. (eds) ECML PKDD 2020 Workshops. ECML PKDD 2020. Communications in Computer and Information Science, vol 1323. Springer, Cham

  24. arXiv:2007.04131  [pdf, other]

    stat.ML cs.LG

    General Pitfalls of Model-Agnostic Interpretation Methods for Machine Learning Models

    Authors: Christoph Molnar, Gunnar König, Julia Herbinger, Timo Freiesleben, Susanne Dandl, Christian A. Scholbeck, Giuseppe Casalicchio, Moritz Grosse-Wentrup, Bernd Bischl

    Abstract: An increasing number of model-agnostic interpretation techniques for machine learning (ML) models such as partial dependence plots (PDP), permutation feature importance (PFI) and Shapley values provide insightful model interpretations, but can lead to wrong conclusions if applied incorrectly. We highlight many general pitfalls of ML model interpretation, such as using interpretation techniques in…

    Submitted 17 August, 2021; v1 submitted 8 July, 2020; originally announced July 2020.

  25. Model-agnostic Feature Importance and Effects with Dependent Features -- A Conditional Subgroup Approach

    Authors: Christoph Molnar, Gunnar König, Bernd Bischl, Giuseppe Casalicchio

    Abstract: The interpretation of feature importance in machine learning models is challenging when features are dependent. Permutation feature importance (PFI) ignores such dependencies, which can cause misleading interpretations due to extrapolation. A possible remedy is more advanced conditional PFI approaches that enable the assessment of feature importance conditional on all other features. Due to this s…

    Submitted 21 June, 2021; v1 submitted 8 June, 2020; originally announced June 2020.

    Journal ref: Data Mining and Knowledge Discovery (2023)

  26. Sampling, Intervention, Prediction, Aggregation: A Generalized Framework for Model-Agnostic Interpretations

    Authors: Christian A. Scholbeck, Christoph Molnar, Christian Heumann, Bernd Bischl, Giuseppe Casalicchio

    Abstract: Model-agnostic interpretation techniques allow us to explain the behavior of any predictive model. Due to different notations and terminology, it is difficult to see how they are related. A unified view on these methods has been missing. We present the generalized SIPA (sampling, intervention, prediction, aggregation) framework of work stages for model-agnostic interpretations and demonstrate how…

    Submitted 13 February, 2020; v1 submitted 8 April, 2019; originally announced April 2019.

    Report number: Cellier P., Driessens K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Communications in Computer and Information Science, vol 1167. Springer, Cham

  27. arXiv:1904.03943  [pdf, other]

    stat.ML cs.LG

    Component-Wise Boosting of Targets for Multi-Output Prediction

    Authors: Quay Au, Daniel Schalk, Giuseppe Casalicchio, Ramona Schoedel, Clemens Stachl, Bernd Bischl

    Abstract: Multi-output prediction deals with the prediction of several targets of possibly diverse types. One way to address this problem is the so called problem transformation method. This method is often used in multi-label learning, but can also be used for multi-output prediction due to its generality and simplicity. In this paper, we introduce an algorithm that uses the problem transformation method f…

    Submitted 8 April, 2019; originally announced April 2019.

  28. Quantifying Model Complexity via Functional Decomposition for Better Post-Hoc Interpretability

    Authors: Christoph Molnar, Giuseppe Casalicchio, Bernd Bischl

    Abstract: Post-hoc model-agnostic interpretation methods such as partial dependence plots can be employed to interpret complex machine learning models. While these interpretation methods can be applied regardless of model complexity, they can produce misleading and verbose results if the model is too complex, especially w.r.t. feature interactions. To quantify the complexity of arbitrary machine learning mo…

    Submitted 23 September, 2019; v1 submitted 8 April, 2019; originally announced April 2019.

    Journal ref: Cellier P., Driessens K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Communications in Computer and Information Science, vol 1167. Springer, Cham

  29. Visualizing the Feature Importance for Black Box Models

    Authors: Giuseppe Casalicchio, Christoph Molnar, Bernd Bischl

    Abstract: In recent years, a large amount of model-agnostic methods to improve the transparency, trustability and interpretability of machine learning models have been developed. We introduce local feature importance as a local version of a recent model-agnostic global feature importance method. Based on local feature importance, we propose two visual tools: partial importance (PI) and individual conditiona…

    Submitted 28 December, 2018; v1 submitted 18 April, 2018; originally announced April 2018.

    Comments: To Appear in Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2018, Dublin, Ireland, September 10 to 14, 2018, Proceedings, Part I

    Journal ref: Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2018. Lecture Notes in Computer Science, vol 11051

  30. arXiv:1708.03731  [pdf, other]

    stat.ML cs.LG

    OpenML Benchmarking Suites

    Authors: Bernd Bischl, Giuseppe Casalicchio, Matthias Feurer, Pieter Gijsbers, Frank Hutter, Michel Lang, Rafael G. Mantovani, Jan N. van Rijn, Joaquin Vanschoren

    Abstract: Machine learning research depends on objectively interpretable, comparable, and reproducible algorithm benchmarks. We advocate the use of curated, comprehensive suites of machine learning tasks to standardize the setup, execution, and reporting of benchmarks. We enable this through software tools that help to create and leverage these benchmarking suites. These are seamlessly integrated into the O…

    Submitted 22 November, 2021; v1 submitted 11 August, 2017; originally announced August 2017.

    Comments: Accepted for publication in the Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS 2021)

    Journal ref: Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (2021)

  31. OpenML: An R Package to Connect to the Machine Learning Platform OpenML

    Authors: Giuseppe Casalicchio, Jakob Bossek, Michel Lang, Dominik Kirchhoff, Pascal Kerschke, Benjamin Hofner, Heidi Seibold, Joaquin Vanschoren, Bernd Bischl

    Abstract: OpenML is an online machine learning platform where researchers can easily share data, machine learning tasks and experiments as well as organize them online to work and collaborate more efficiently. In this paper, we present an R package to interface with the OpenML platform and illustrate its usage in combination with the machine learning R package mlr. We show how the OpenML package allows R us…

    Submitted 4 May, 2017; v1 submitted 5 January, 2017; originally announced January 2017.

    Journal ref: Computational Statistics, 2019, vol. 34, no. 3, pp. 977-991