Showing 1–14 of 14 results for author: Franceschi, L

  1. arXiv:2407.12872

    cs.CL cs.LG

    Evaluating Large Language Models with fmeval

    Authors: Pola Schwöbel, Luca Franceschi, Muhammad Bilal Zafar, Keerthan Vasist, Aman Malhotra, Tomer Shenhar, Pinal Tailor, Pinar Yilmaz, Michael Diamond, Michele Donini

    Abstract: fmeval is an open source library to evaluate large language models (LLMs) in a range of tasks. It helps practitioners evaluate their model for task performance and along multiple responsible AI dimensions. This paper presents the library and exposes its underlying design principles: simplicity, coverage, extensibility and performance. We then present how these were implemented in the scientific an…

    Submitted 15 July, 2024; originally announced July 2024.

  2. arXiv:2402.09947

    cs.LG

    Explaining Probabilistic Models with Distributional Values

    Authors: Luca Franceschi, Michele Donini, Cédric Archambeau, Matthias Seeger

    Abstract: A large branch of explainable machine learning is grounded in cooperative game theory. However, research indicates that game-theoretic explanations may mislead or be hard to interpret. We argue that often there is a critical mismatch between what one wishes to explain (e.g. the output of a classifier) and what current methods such as SHAP explain (e.g. the scalar probability of a class). This pape…

    Submitted 14 June, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: ICML 2024 (spotlight paper). Code: https://github.com/amazon-science/explaining-probabilistic-models-with-distributinal-values

  3. arXiv:2301.11898

    cs.LG cs.AI stat.ML

    DAG Learning on the Permutahedron

    Authors: Valentina Zantedeschi, Luca Franceschi, Jean Kaddour, Matt J. Kusner, Vlad Niculae

    Abstract: We propose a continuous optimization framework for discovering a latent directed acyclic graph (DAG) from observational data. Our approach optimizes over the polytope of permutation vectors, the so-called Permutahedron, to learn a topological ordering. Edges can be optimized jointly, or learned conditional on the ordering via a non-differentiable subroutine. Compared to existing continuous optimiz…

    Submitted 10 February, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

    Comments: The Eleventh International Conference on Learning Representations

  4. arXiv:2210.15353

    cs.LG cs.AI stat.ML

    Learning Discrete Directed Acyclic Graphs via Backpropagation

    Authors: Andrew J. Wren, Pasquale Minervini, Luca Franceschi, Valentina Zantedeschi

    Abstract: Recently continuous relaxations have been proposed in order to learn Directed Acyclic Graphs (DAGs) from data by backpropagation, instead of using combinatorial optimization. However, a number of techniques for fully discrete backpropagation could instead be applied. In this paper, we explore that direction and propose DAG-DB, a framework for learning DAGs by Discrete Backpropagation. Based on the…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: 15 pages, 2 figures, 7 tables. Accepted for NeurIPS 2022 workshops on: Causal Machine Learning for Real-World Impact; and Neuro Causal and Symbolic AI

  5. arXiv:2209.04862

    cs.LG cs.AI cs.CL cs.NE

    Adaptive Perturbation-Based Gradient Estimation for Discrete Latent Variable Models

    Authors: Pasquale Minervini, Luca Franceschi, Mathias Niepert

    Abstract: The integration of discrete algorithmic components in deep learning architectures has numerous applications. Recently, Implicit Maximum Likelihood Estimation (IMLE, Niepert, Minervini, and Franceschi 2021), a class of gradient estimators for discrete exponential family distributions, was proposed by combining implicit differentiation through perturbation with the path-wise gradient estimator. Howe…

    Submitted 5 February, 2023; v1 submitted 11 September, 2022; originally announced September 2022.

    Comments: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI 2023)

  6. arXiv:2207.09980

    cs.LG cs.AI cs.CL

    ReFactor GNNs: Revisiting Factorisation-based Models from a Message-Passing Perspective

    Authors: Yihong Chen, Pushkar Mishra, Luca Franceschi, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel

    Abstract: Factorisation-based Models (FMs), such as DistMult, have enjoyed enduring success for Knowledge Graph Completion (KGC) tasks, often outperforming Graph Neural Networks (GNNs). However, unlike GNNs, FMs struggle to incorporate node features and generalise to unseen nodes in inductive settings. Our work bridges the gap between FMs and GNNs by proposing ReFactor GNNs. This new architecture draws upon…

    Submitted 27 October, 2022; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

    MSC Class: 68T05; 68T07; 68T50

    ACM Class: I.2.7; I.2.6

  7. arXiv:2106.01798

    cs.LG cs.AI

    Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions

    Authors: Mathias Niepert, Pasquale Minervini, Luca Franceschi

    Abstract: Combining discrete probability distributions and combinatorial optimization problems with neural network components has numerous applications but poses several challenges. We propose Implicit Maximum Likelihood Estimation (I-MLE), a framework for end-to-end learning of models combining discrete exponential family distributions and differentiable neural components. I-MLE is widely applicable as it…

    Submitted 27 October, 2021; v1 submitted 3 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021 camera-ready; repo: https://github.com/nec-research/tf-imle

  8. arXiv:2006.16218

    stat.ML cs.LG

    On the Iteration Complexity of Hypergradient Computation

    Authors: Riccardo Grazzi, Luca Franceschi, Massimiliano Pontil, Saverio Salzo

    Abstract: We study a general class of bilevel problems, consisting in the minimization of an upper-level objective which depends on the solution to a parametric fixed-point equation. Important instances arising in machine learning include hyperparameter optimization, meta-learning, and certain graph and recurrent neural networks. Typically the gradient of the upper-level objective (hypergradient) is hard or…

    Submitted 10 July, 2020; v1 submitted 29 June, 2020; originally announced June 2020.

    Comments: accepted at ICML 2020; 19 pages, 4 figures; code at https://github.com/prolearner/hypertorch (corrected typos and one reference)

  9. arXiv:1910.08525

    cs.LG stat.ML

    MARTHE: Scheduling the Learning Rate Via Online Hypergradients

    Authors: Michele Donini, Luca Franceschi, Massimiliano Pontil, Orchid Majumder, Paolo Frasconi

    Abstract: We study the problem of fitting task-specific learning rate schedules from the perspective of hyperparameter optimization, aiming at good generalization. We describe the structure of the gradient of a validation error w.r.t. the learning rate schedule -- the hypergradient. Based on this, we introduce MARTHE, a novel online algorithm guided by cheap approximations of the hypergradient that uses pas…

    Submitted 17 May, 2020; v1 submitted 18 October, 2019; originally announced October 2019.

    Comments: IJCAI 2020. Larger images. Code available at https://github.com/awslabs/adatune

  10. arXiv:1903.11960

    cs.LG stat.ML

    Learning Discrete Structures for Graph Neural Networks

    Authors: Luca Franceschi, Mathias Niepert, Massimiliano Pontil, Xiao He

    Abstract: Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we pro…

    Submitted 19 June, 2020; v1 submitted 28 March, 2019; originally announced March 2019.

    Comments: ICML 2019, code at https://github.com/lucfra/LDS - Revision of Sec. 3

  11. Fast and Continuous Foothold Adaptation for Dynamic Locomotion through CNNs

    Authors: Octavio Villarreal, Victor Barasuol, Marco Camurri, Luca Franceschi, Michele Focchi, Massimiliano Pontil, Darwin G. Caldwell, Claudio Semini

    Abstract: Legged robots can outperform wheeled machines for most navigation tasks across unknown and rough terrains. For such tasks, visual feedback is a fundamental asset to provide robots with terrain-awareness. However, robust dynamic locomotion on difficult terrains with real-time performance guarantees remains a challenge. We present here a real-time, dynamic foothold adaptation strategy based on visua…

    Submitted 15 February, 2019; v1 submitted 25 September, 2018; originally announced September 2018.

    Comments: 9 pages, 11 figures. Accepted to RA-L + ICRA 2019, January 2019

  12. arXiv:1806.04941

    cs.MS cs.LG stat.ML

    Far-HO: A Bilevel Programming Package for Hyperparameter Optimization and Meta-Learning

    Authors: Luca Franceschi, Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo, Paolo Frasconi

    Abstract: In (Franceschi et al., 2018) we proposed a unified mathematical framework, grounded on bilevel programming, that encompasses gradient-based hyperparameter optimization and meta-learning. We formulated an approximate version of the problem where the inner objective is solved iteratively, and gave sufficient conditions ensuring convergence to the exact problem. In this work we show how to optimize l…

    Submitted 13 June, 2018; originally announced June 2018.

    Comments: This submission is a reduced version of (Franceschi et al., arXiv:1806.04910) which has been accepted at the main ICML 2018 conference. In this paper we illustrate the software framework, material that could not be included in the conference paper

  13. arXiv:1806.04910

    stat.ML cs.LG

    Bilevel Programming for Hyperparameter Optimization and Meta-Learning

    Authors: Luca Franceschi, Paolo Frasconi, Saverio Salzo, Riccardo Grazzi, Massimiliano Pontil

    Abstract: We introduce a framework based on bilevel programming that unifies gradient-based hyperparameter optimization and meta-learning. We show that an approximate version of the bilevel problem can be solved by taking into explicit account the optimization dynamics for the inner objective. Depending on the specific setting, the outer variables take either the meaning of hyperparameters in a supervised l…

    Submitted 3 July, 2018; v1 submitted 13 June, 2018; originally announced June 2018.

    Comments: ICML 2018; code for replicating experiments at https://github.com/prolearner/hyper-representation, main package (Far-HO) at https://github.com/lucfra/FAR-HO

  14. arXiv:1712.06283

    stat.ML cs.LG

    A Bridge Between Hyperparameter Optimization and Learning-to-learn

    Authors: Luca Franceschi, Michele Donini, Paolo Frasconi, Massimiliano Pontil

    Abstract: We consider a class of nested optimization problems involving inner and outer objectives. We observe that by taking into explicit account the optimization dynamics for the inner objective it is possible to derive a general framework that unifies gradient-based hyperparameter optimization and meta-learning (or learning-to-learn). Depending on the specific setting, the variables of the outer objec…

    Submitted 21 August, 2019; v1 submitted 18 December, 2017; originally announced December 2017.

    Comments: NIPS 2017 workshop on Meta-learning (http://metalearning.ml/)