Showing 1–48 of 48 results for author: Jamnik, M

  1. arXiv:2406.04165  [pdf, other]

    cs.LG

    Repurposing Language Models into Embedding Models: Finding the Compute-Optimal Recipe

    Authors: Alicja Ziarko, Albert Q. Jiang, Bartosz Piotrowski, Wenda Li, Mateja Jamnik, Piotr Miłoś

    Abstract: Text embeddings are essential for many tasks, such as document retrieval, clustering, and semantic similarity assessment. In this paper, we study how to contrastively train text embedding models in a compute-optimal fashion, given a suite of pre-trained decoder-only language models. Our innovation is an algorithm that produces optimal configurations of model sizes, data quantities, and fine-tuning…

    Submitted 6 June, 2024; originally announced June 2024.

  2. arXiv:2406.01805  [pdf, other]

    cs.LG cs.AI

    TabMDA: Tabular Manifold Data Augmentation for Any Classifier using Transformers with In-context Subsetting

    Authors: Andrei Margeloiu, Adrián Bazaga, Nikola Simidjievski, Pietro Liò, Mateja Jamnik

    Abstract: Tabular data is prevalent in many critical domains, yet it is often challenging to acquire in large quantities. This scarcity usually results in poor performance of machine learning models on such data. Data augmentation, a common strategy for performance improvement in vision and language tasks, typically underperforms for tabular data due to the lack of explicit symmetries in the input space. To…

    Submitted 29 July, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Presented at 1st ICML Workshop on In-Context Learning (ICL @ ICML 2024)

  3. arXiv:2405.19950  [pdf, other]

    cs.LG cs.AI

    MM-Lego: Modular Biomedical Multimodal Models with Minimal Fine-Tuning

    Authors: Konstantin Hemker, Nikola Simidjievski, Mateja Jamnik

    Abstract: Learning holistic computational representations in physical, chemical or biological systems requires the ability to process information from different distributions and modalities within the same model. Thus, the demand for multimodal machine learning models has sharply risen for modalities that go beyond vision and language, such as sequences, graphs, time series, or tabular data. While there are…

    Submitted 30 May, 2024; originally announced May 2024.

  4. arXiv:2405.18217  [pdf, other]

    cs.LG

    Understanding Inter-Concept Relationships in Concept-Based Models

    Authors: Naveen Raman, Mateo Espinosa Zarlenga, Mateja Jamnik

    Abstract: Concept-based explainability methods provide insight into deep learning systems by constructing explanations using human-understandable concepts. While the literature on human reasoning demonstrates that we exploit relationships between concepts when solving tasks, it is unclear whether concept-based methods incorporate the rich structure of inter-concept relationships. We analyse the concept repr…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Accepted at ICML 2024

  5. arXiv:2403.15297  [pdf, other]

    cs.AI

    Sphere Neural-Networks for Rational Reasoning

    Authors: Tiansi Dong, Mateja Jamnik, Pietro Liò

    Abstract: The success of Large Language Models (LLMs), e.g., ChatGPT, is witnessed by their planetary popularity, their capability of human-like communication, and also by their steadily improved reasoning performance. However, it remains unclear whether LLMs reason. It is an open problem how traditional neural networks can be qualitatively extended to go beyond the statistic paradigm and achieve high-level…

    Submitted 24 June, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  6. arXiv:2401.01259  [pdf, other]

    cs.LG cs.AI

    Do Concept Bottleneck Models Obey Locality?

    Authors: Naveen Raman, Mateo Espinosa Zarlenga, Juyeon Heo, Mateja Jamnik

    Abstract: Concept-based methods explain model predictions using human-understandable concepts. These models require accurate concept predictors, yet the faithfulness of existing concept predictors to their underlying concepts is unclear. In this paper, we investigate the faithfulness of Concept Bottleneck Models (CBMs), a popular family of concept-based architectures, by looking at whether they respect "loc…

    Submitted 28 May, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

    Comments: Previous version accepted at NeurIPS 23 XAI in Action Workshop

  7. arXiv:2311.09115  [pdf, other]

    cs.LG cs.AI

    HEALNet -- Hybrid Multi-Modal Fusion for Heterogeneous Biomedical Data

    Authors: Konstantin Hemker, Nikola Simidjievski, Mateja Jamnik

    Abstract: Technological advances in medical data collection such as high-resolution histopathology and high-throughput genomic sequencing have contributed to the rising requirement for multi-modal biomedical modelling, specifically for image, tabular, and graph data. Most multi-modal deep learning approaches use modality-specific architectures that are trained separately and cannot capture the crucial cross…

    Submitted 20 November, 2023; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: 7 pages body, 5 pages appendix

  8. arXiv:2311.03755  [pdf, other]

    cs.CL cs.LG

    Multilingual Mathematical Autoformalization

    Authors: Albert Q. Jiang, Wenda Li, Mateja Jamnik

    Abstract: Autoformalization is the task of translating natural language materials into machine-verifiable formalisations. Progress in autoformalization research is hindered by the lack of a sizeable dataset consisting of informal-formal pairs expressing the same essence. Existing methods tend to circumvent this challenge by manually curating small corpora or using few-shot learning with large language model…

    Submitted 9 November, 2023; v1 submitted 7 November, 2023; originally announced November 2023.

  9. arXiv:2309.16928  [pdf, other]

    cs.LG cs.AI

    Learning to Receive Help: Intervention-Aware Concept Embedding Models

    Authors: Mateo Espinosa Zarlenga, Katherine M. Collins, Krishnamurthy Dvijotham, Adrian Weller, Zohreh Shams, Mateja Jamnik

    Abstract: Concept Bottleneck Models (CBMs) tackle the opacity of neural architectures by constructing and explaining their predictions using a set of high-level concepts. A special property of these models is that they permit concept interventions, wherein users can correct mispredicted concepts and thus improve the model's performance. Recent work, however, has shown that intervention efficacy can be highl…

    Submitted 25 October, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: Accepted as a spotlight at the Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)

  10. arXiv:2306.15661  [pdf, other]

    cs.LG cs.AI

    Enhancing Representation Learning on High-Dimensional, Small-Size Tabular Data: A Divide and Conquer Method with Ensembled VAEs

    Authors: Navindu Leelarathna, Andrei Margeloiu, Mateja Jamnik, Nikola Simidjievski

    Abstract: Variational Autoencoders and their many variants have displayed impressive ability to perform dimensionality reduction, often achieving state-of-the-art performance. Many current methods, however, struggle to learn good representations in High Dimensional, Low Sample Size (HDLSS) tasks, which is an inherently challenging setting. We address this challenge by using an ensemble of lightweight VAEs to…

    Submitted 27 June, 2023; originally announced June 2023.

  11. arXiv:2306.12330  [pdf, other]

    cs.LG

    ProtoGate: Prototype-based Neural Networks with Global-to-local Feature Selection for Tabular Biomedical Data

    Authors: Xiangjian Jiang, Andrei Margeloiu, Nikola Simidjievski, Mateja Jamnik

    Abstract: Tabular biomedical data poses challenges in machine learning because it is often high-dimensional and typically low-sample-size (HDLSS). Previous research has attempted to address these challenges via local feature selection, but existing approaches often fail to achieve optimal performance due to their limitation in identifying globally important features and their susceptibility to the co-adapta…

    Submitted 3 June, 2024; v1 submitted 21 June, 2023; originally announced June 2023.

    Comments: Accepted by the Forty-first International Conference on Machine Learning (ICML2024)

  12. arXiv:2306.01694  [pdf, other]

    cs.LG cs.HC

    Evaluating Language Models for Mathematics through Interactions

    Authors: Katherine M. Collins, Albert Q. Jiang, Simon Frieder, Lionel Wong, Miri Zilka, Umang Bhatt, Thomas Lukasiewicz, Yuhuai Wu, Joshua B. Tenenbaum, William Hart, Timothy Gowers, Wenda Li, Adrian Weller, Mateja Jamnik

    Abstract: There is much excitement about the opportunity to harness the power of large language models (LLMs) when building problem-solving assistants. However, the standard methodology of evaluating LLMs relies on static pairs of inputs and outputs, and is insufficient for making an informed decision about which LLMs and under which assistive settings can they be sensibly used. Static assessment fails to a…

    Submitted 5 November, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

  13. arXiv:2304.14068  [pdf, other]

    cs.AI cs.LG cs.NE stat.ML

    Interpretable Neural-Symbolic Concept Reasoning

    Authors: Pietro Barbiero, Gabriele Ciravegna, Francesco Giannini, Mateo Espinosa Zarlenga, Lucie Charlotte Magister, Alberto Tonda, Pietro Lio', Frederic Precioso, Mateja Jamnik, Giuseppe Marra

    Abstract: Deep learning methods are highly accurate, yet their opaque decision process prevents them from earning full human trust. Concept-based models aim to address this issue by learning tasks based on a set of human-understandable concepts. However, state-of-the-art concept-based models rely on high-dimensional concept embedding representations which lack a clear semantic meaning, thus questioning the…

    Submitted 22 May, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Journal ref: Proceedings of the 40th International Conference on Machine Learning, PMLR 202:1801-1825, 2023

  14. arXiv:2304.05207  [pdf, other]

    cs.LG cs.AI

    CGXplain: Rule-Based Deep Neural Network Explanations Using Dual Linear Programs

    Authors: Konstantin Hemker, Zohreh Shams, Mateja Jamnik

    Abstract: Rule-based surrogate models are an effective and interpretable way to approximate a Deep Neural Network's (DNN) decision boundaries, allowing humans to easily understand deep learning models. Current state-of-the-art decompositional methods, which are those that consider the DNN's latent space to extract more exact rule sets, manage to derive rule sets at high accuracy. However, they a) do not gua…

    Submitted 11 April, 2023; originally announced April 2023.

    Comments: Accepted at ICLR 2023 Workshop on Trustworthy Machine Learning for Healthcare

  15. arXiv:2303.12872  [pdf, other]

    cs.HC cs.AI cs.LG

    Human Uncertainty in Concept-Based AI Systems

    Authors: Katherine M. Collins, Matthew Barker, Mateo Espinosa Zarlenga, Naveen Raman, Umang Bhatt, Mateja Jamnik, Ilia Sucholutsky, Adrian Weller, Krishnamurthy Dvijotham

    Abstract: Placing a human in the loop may abate the risks of deploying AI systems in safety-critical settings (e.g., a clinician working with a medical AI system). However, mitigating risks arising from human error and uncertainty within such human-AI interactions is an important and understudied issue. In this work, we study human uncertainty in the context of concept-based models, a family of AI systems t…

    Submitted 22 March, 2023; originally announced March 2023.

  16. arXiv:2302.04899  [pdf, other]

    cs.LG

    GCI: A (G)raph (C)oncept (I)nterpretation Framework

    Authors: Dmitry Kazhdan, Botty Dimanov, Lucie Charlotte Magister, Pietro Barbiero, Mateja Jamnik, Pietro Lio

    Abstract: Explainable AI (XAI) underwent a recent surge in research on concept extraction, focusing on extracting human-interpretable concepts from Deep Neural Networks. An important challenge facing concept extraction approaches is the difficulty of interpreting and evaluating discovered concepts, especially for complex tasks such as molecular property prediction. We address this challenge by presenting GC…

    Submitted 9 February, 2023; originally announced February 2023.

  17. Towards Robust Metrics for Concept Representation Evaluation

    Authors: Mateo Espinosa Zarlenga, Pietro Barbiero, Zohreh Shams, Dmitry Kazhdan, Umang Bhatt, Adrian Weller, Mateja Jamnik

    Abstract: Recent work on interpretability has focused on concept-based explanations, where deep learning models are explained in terms of high-level units of information, referred to as concepts. Concept learning models, however, have been shown to be prone to encoding impurities in their representations, failing to fully capture meaningful features of their inputs. While concept learning lacks metrics to m…

    Submitted 24 January, 2023; originally announced January 2023.

    Comments: To appear at AAAI 2023

    MSC Class: 68T07 ACM Class: I.2.6

  18. arXiv:2211.15616  [pdf, other]

    cs.LG cs.AI

    Weight Predictor Network with Feature Selection for Small Sample Tabular Biomedical Data

    Authors: Andrei Margeloiu, Nikola Simidjievski, Pietro Lio, Mateja Jamnik

    Abstract: Tabular biomedical data is often high-dimensional but with a very small number of samples. Although recent work showed that well-regularised simple neural networks could outperform more sophisticated architectures on tabular data, they are still prone to overfitting on tiny datasets with many potentially irrelevant features. To combat these issues, we propose Weight Predictor Network with Feature…

    Submitted 28 November, 2022; originally announced November 2022.

    Comments: Accepted to AAAI-2023

  19. arXiv:2211.10830  [pdf, other]

    cs.LG math.SG

    Discrete Lagrangian Neural Networks with Automatic Symmetry Discovery

    Authors: Yana Lishkova, Paul Scherer, Steffen Ridderbusch, Mateja Jamnik, Pietro Liò, Sina Ober-Blöbaum, Christian Offen

    Abstract: By one of the most fundamental principles in physics, a dynamical system will exhibit those motions which extremise an action functional. This leads to the formation of the Euler-Lagrange equations, which serve as a model of how the system will behave in time. If the dynamics exhibit additional symmetries, then the motion fulfils additional conservation laws, such as conservation of energy (time i…

    Submitted 19 November, 2022; originally announced November 2022.

  20. arXiv:2211.07650  [pdf, other]

    cs.LG cs.AI

    Explainer Divergence Scores (EDS): Some Post-Hoc Explanations May be Effective for Detecting Unknown Spurious Correlations

    Authors: Shea Cardozo, Gabriel Islas Montero, Dmitry Kazhdan, Botty Dimanov, Maleakhi Wijaya, Mateja Jamnik, Pietro Lio

    Abstract: Recent work has suggested post-hoc explainers might be ineffective for detecting spurious correlations in Deep Neural Networks (DNNs). However, we show there are serious weaknesses with the existing evaluation frameworks for this setting. Previously proposed metrics are extremely difficult to interpret and are not directly comparable between explainer methods. To alleviate these constraints, we pr…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: Presented at the AIMLAI workshop at the 31st ACM International Conference on Information and Knowledge Management (CIKM 2022)

  21. arXiv:2211.06302  [pdf, other]

    cs.LG q-bio.QM

    GCondNet: A Novel Method for Improving Neural Networks on Small High-Dimensional Tabular Data

    Authors: Andrei Margeloiu, Nikola Simidjievski, Pietro Lio, Mateja Jamnik

    Abstract: Neural networks often struggle with high-dimensional but small sample-size tabular datasets. One reason is that current weight initialisation methods assume independence between weights, which can be problematic when there are insufficient samples to estimate the model's parameters accurately. In such small data scenarios, leveraging additional structures can improve the model's performance and tr…

    Submitted 17 August, 2024; v1 submitted 11 November, 2022; originally announced November 2022.

    Comments: Published in Transactions on Machine Learning Research (TMLR) 2024. Also accepted and selected for oral presentation at NeurIPS 2023 - Table Representation Learning Workshop

  22. arXiv:2210.12283  [pdf, other]

    cs.AI cs.LG

    Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs

    Authors: Albert Q. Jiang, Sean Welleck, Jin Peng Zhou, Wenda Li, Jiacheng Liu, Mateja Jamnik, Timothée Lacroix, Yuhuai Wu, Guillaume Lample

    Abstract: The formalization of existing mathematical proofs is a notoriously difficult process. Despite decades of research on automation and proof assistants, writing formal proofs remains arduous and only accessible to a few experts. While previous studies to automate formalization focused on powerful search algorithms, no attempts were made to take advantage of available informal proofs. In this work, we…

    Submitted 20 February, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

  23. arXiv:2209.09383  [pdf, other]

    cs.LG q-bio.QM

    Distributed representations of graphs for drug pair scoring

    Authors: Paul Scherer, Pietro Liò, Mateja Jamnik

    Abstract: In this paper we study the practicality and usefulness of incorporating distributed representations of graphs into models within the context of drug pair scoring. We argue that the real world growth and update cycles of drug pair scoring datasets subvert the limitations of transductive learning associated with distributed representations. Furthermore, we argue that the vocabulary of discrete subst…

    Submitted 24 November, 2022; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: Updated manuscript, 9 main pages, 8 pages reference and appendix

  24. arXiv:2209.09056  [pdf, other]

    cs.LG cs.AI

    Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off

    Authors: Mateo Espinosa Zarlenga, Pietro Barbiero, Gabriele Ciravegna, Giuseppe Marra, Francesco Giannini, Michelangelo Diligenti, Zohreh Shams, Frederic Precioso, Stefano Melacci, Adrian Weller, Pietro Lio, Mateja Jamnik

    Abstract: Deploying AI-powered systems requires trustworthy models supporting effective human interactions, going beyond raw prediction accuracy. Concept bottleneck models promote trustworthiness by conditioning classification tasks on an intermediate level of human-like concepts. This enables human interventions which can correct mispredicted concepts to improve the model's performance. However, existing c…

    Submitted 5 December, 2022; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: To appear at NeurIPS 2022

    Report number: 35 MSC Class: 68T07 ACM Class: I.2.6

    Journal ref: https://proceedings.neurips.cc/paper_files/paper/2022/hash/867c06823281e506e8059f5c13a57f75-Abstract-Conference.html

  25. arXiv:2207.13586  [pdf, other]

    cs.LG cs.AI cs.LO

    Encoding Concepts in Graph Neural Networks

    Authors: Lucie Charlotte Magister, Pietro Barbiero, Dmitry Kazhdan, Federico Siciliano, Gabriele Ciravegna, Fabrizio Silvestri, Mateja Jamnik, Pietro Lio

    Abstract: The opaque reasoning of Graph Neural Networks induces a lack of human trust. Existing graph network explainers attempt to address this issue by providing post-hoc explanations, however, they fail to make the model itself more interpretable. To fill this gap, we introduce the Concept Encoder Module, the first differentiable concept-discovery approach for graph networks. The proposed approach makes…

    Submitted 7 August, 2022; v1 submitted 27 July, 2022; originally announced July 2022.

  26. arXiv:2206.03172  [pdf, other]

    cs.AI cs.LO

    Representational Systems Theory: A Unified Approach to Encoding, Analysing and Transforming Representations

    Authors: Daniel Raggi, Gem Stapleton, Mateja Jamnik, Aaron Stockdill, Grecia Garcia Garcia, Peter C-H. Cheng

    Abstract: The study of representations is of fundamental importance to any form of communication, and our ability to exploit them effectively is paramount. This article presents a novel theory -- Representational Systems Theory -- that is designed to abstractly encode a wide variety of representations from three core perspectives: syntax, entailment, and their properties. By introducing the concept of a con…

    Submitted 7 June, 2022; originally announced June 2022.

    Comments: 118 pages total: 94 of main paper + 2 of references + 22 of appendices. Submitted to JACM. Authors Gem Stapleton and Daniel Raggi contributed equally to this research

    MSC Class: 68T30 (primary); 68T27; 68T35; 68T01; 68T15 ACM Class: F.4; I.2.4; I.2.3; F.4.3; D.3.1

  27. arXiv:2205.12615  [pdf, ps, other]

    cs.LG cs.AI cs.LO cs.SE

    Autoformalization with Large Language Models

    Authors: Yuhuai Wu, Albert Q. Jiang, Wenda Li, Markus N. Rabe, Charles Staats, Mateja Jamnik, Christian Szegedy

    Abstract: Autoformalization is the process of automatically translating from natural language mathematics to formal specifications and proofs. A successful autoformalization system could advance the fields of formal verification, program synthesis, and artificial intelligence. While the long-term goal of autoformalization seemed elusive for a long time, we show large language models provide new prospects to…

    Submitted 25 May, 2022; originally announced May 2022.

    Comments: 44 pages

  28. arXiv:2205.10893  [pdf, other]

    cs.AI

    Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers

    Authors: Albert Q. Jiang, Wenda Li, Szymon Tworkowski, Konrad Czechowski, Tomasz Odrzygóźdź, Piotr Miłoś, Yuhuai Wu, Mateja Jamnik

    Abstract: In theorem proving, the task of selecting useful premises from a large library to unlock the proof of a given conjecture is crucially important. This presents a challenge for all theorem provers, especially the ones based on language models, due to their relative inability to reason over huge volumes of premises in text form. This paper introduces Thor, a framework integrating language models and…

    Submitted 22 May, 2022; originally announced May 2022.

  29. arXiv:2111.12628  [pdf, other]

    cs.LG cs.AI

    Efficient Decompositional Rule Extraction for Deep Neural Networks

    Authors: Mateo Espinosa Zarlenga, Zohreh Shams, Mateja Jamnik

    Abstract: In recent years, there has been significant work on increasing both interpretability and debuggability of a Deep Neural Network (DNN) by extracting a rule-based model that approximates its decision boundary. Nevertheless, current DNN rule extraction methods that consider a DNN's latent space when extracting rules, known as decompositional algorithms, are either restricted to single-layer DNNs or i…

    Submitted 24 November, 2021; originally announced November 2021.

    Comments: Accepted at NeurIPS 2021 Workshop on eXplainable AI approaches for debugging and diagnosis (XAI4Debugging)

  30. arXiv:2105.04289  [pdf, other]

    cs.LG cs.AI

    Do Concept Bottleneck Models Learn as Intended?

    Authors: Andrei Margeloiu, Matthew Ashman, Umang Bhatt, Yanzhi Chen, Mateja Jamnik, Adrian Weller

    Abstract: Concept bottleneck models map from raw inputs to concepts, and then from concepts to targets. Such models aim to incorporate pre-specified, high-level concepts into the learning procedure, and have been motivated to meet three desiderata: interpretability, predictability, and intervenability. However, we find that concept bottleneck models struggle to meet these goals. Using post hoc interpretabil…

    Submitted 10 May, 2021; originally announced May 2021.

    Comments: Accepted at ICLR 2021 Workshop on Responsible AI

  31. arXiv:2104.08952  [pdf, other]

    cs.LG

    Failing Conceptually: Concept-Based Explanations of Dataset Shift

    Authors: Maleakhi A. Wijaya, Dmitry Kazhdan, Botty Dimanov, Mateja Jamnik

    Abstract: Despite their remarkable performance on a wide range of visual tasks, machine learning technologies often succumb to data distribution shifts. Consequently, a range of recent work explores techniques for detecting these shifts. Unfortunately, current techniques offer no explanations about what triggers the detection of shifts, thus limiting their utility to provide actionable insights. In this wor…

    Submitted 1 May, 2021; v1 submitted 18 April, 2021; originally announced April 2021.

    Comments: ICLR 2021 Workshop (RobustML), 16 pages, 14 figures; typos corrected

  32. arXiv:2104.06917  [pdf, other]

    cs.LG cs.AI

    Is Disentanglement all you need? Comparing Concept-based & Disentanglement Approaches

    Authors: Dmitry Kazhdan, Botty Dimanov, Helena Andres Terre, Mateja Jamnik, Pietro Liò, Adrian Weller

    Abstract: Concept-based explanations have emerged as a popular way of extracting human-interpretable representations from deep discriminative models. At the same time, the disentanglement learning literature has focused on extracting similar representations in an unsupervised or weakly-supervised way, using deep generative models. Despite the overlapping goals and potential synergies, to our knowledge, ther…

    Submitted 14 April, 2021; originally announced April 2021.

    Comments: Presented at the RAI, WeaSul, and RobustML workshops at The Ninth International Conference on Learning Representations (ICLR) 2021

  33. arXiv:2012.06954  [pdf, other]

    cs.LG cs.AI

    MEME: Generating RNN Model Explanations via Model Extraction

    Authors: Dmitry Kazhdan, Botty Dimanov, Mateja Jamnik, Pietro Liò

    Abstract: Recurrent Neural Networks (RNNs) have achieved remarkable performance on a range of tasks. A key step to further empowering RNN-based approaches is improving their explainability and interpretability. In this work we present MEME: a model extraction approach capable of approximating RNNs with interpretable models represented by human-understandable concepts and their interactions. We demonstrate h…

    Submitted 12 December, 2020; originally announced December 2020.

    Comments: Presented at the HAMLETS workshop at the 34th Conference on Neural Information Processing Systems (NeurIPS 2020)

  34. arXiv:2012.01166  [pdf, other]

    cs.LG

    Improving Interpretability in Medical Imaging Diagnosis using Adversarial Training

    Authors: Andrei Margeloiu, Nikola Simidjievski, Mateja Jamnik, Adrian Weller

    Abstract: We investigate the influence of adversarial training on the interpretability of convolutional neural networks (CNNs), specifically applied to diagnosing skin cancer. We show that gradient-based saliency maps of adversarially trained CNNs are significantly sharper and more visually coherent than those of standardly trained CNNs. Furthermore, we show that adversarially trained networks highlight reg…

    Submitted 2 December, 2020; originally announced December 2020.

    Comments: To appear at NeurIPS 2020 workshop "Medical Imaging meets NeurIPS (MED-NEURIPS)"

  35. arXiv:2011.10998  [pdf, other]

    q-bio.GN cs.LG

    Using ontology embeddings for structural inductive bias in gene expression data analysis

    Authors: Maja Trębacz, Zohreh Shams, Mateja Jamnik, Paul Scherer, Nikola Simidjievski, Helena Andres Terre, Pietro Liò

    Abstract: Stratifying cancer patients based on their gene expression levels allows improving diagnosis, survival analysis and treatment planning. However, such data is extremely highly dimensional as it contains expression values for over 20000 genes per patient, and the number of samples in the datasets is low. To deal with such settings, we propose to incorporate prior biological knowledge about genes fro…

    Submitted 22 November, 2020; originally announced November 2020.

    Comments: 4 pages + 2 page references, 15th Machine Learning in Computational Biology (MLCB) meeting, 2020

  36. arXiv:2011.01306  [pdf, other]

    cs.AI

    Pairwise Relations Discriminator for Unsupervised Raven's Progressive Matrices

    Authors: Nicholas Quek Wei Kiat, Duo Wang, Mateja Jamnik

    Abstract: The ability to hypothesise, develop abstract concepts based on concrete observations and apply these hypotheses to justify future actions has been paramount in human development. An existing line of research in outfitting intelligent machines with abstract reasoning capabilities revolves around the Raven's Progressive Matrices (RPM). There have been many breakthroughs in supervised approaches to s…

    Submitted 5 August, 2021; v1 submitted 2 November, 2020; originally announced November 2020.

  37. arXiv:2010.13233  [pdf, other]

    cs.LG

    Now You See Me (CME): Concept-based Model Extraction

    Authors: Dmitry Kazhdan, Botty Dimanov, Mateja Jamnik, Pietro Liò, Adrian Weller

    Abstract: Deep Neural Networks (DNNs) have achieved remarkable performance on a range of tasks. A key step to further empowering DNN-based approaches is improving their explainability. In this work we present CME: a concept-based model extraction framework, used for analysing DNN models via concept-based extracted models. Using two case studies (dSprites, and Caltech UCSD Birds), we demonstrate how CME can…

    Submitted 25 October, 2020; originally announced October 2020.

    Comments: Presented at the AIMLAI workshop at the 29th ACM International Conference on Information and Knowledge Management (CIKM 2020)

  38. arXiv:2010.00387  [pdf, other]

    q-bio.MN cs.LG cs.SI stat.ML

    Incorporating network based protein complex discovery into automated model construction

    Authors: Paul Scherer, Maja Trȩbacz, Nikola Simidjievski, Zohreh Shams, Helena Andres Terre, Pietro Liò, Mateja Jamnik

    Abstract: We propose a method for gene expression based analysis of cancer phenotypes incorporating network biology knowledge through unsupervised construction of computational graphs. The structural construction of the computational graphs is driven by the use of topological clustering algorithms on protein-protein networks which incorporate inductive biases stemming from network biology research in protei…

    Submitted 29 September, 2020; originally announced October 2020.

    Comments: 7 Pages, 2 Figures

  39. arXiv:2009.09232  [pdf, other]

    cs.LG cs.NE

    Learned Low Precision Graph Neural Networks

    Authors: Yiren Zhao, Duo Wang, Daniel Bates, Robert Mullins, Mateja Jamnik, Pietro Lio

    Abstract: Deep Graph Neural Networks (GNNs) show promising performance on a range of graph tasks, yet at present are costly to run and lack many of the optimisations applied to DNNs. We show, for the first time, how to systematically quantise GNNs with minimal or no loss in performance using Network Architecture Search (NAS). We define the possible quantisation search space of GNNs. The proposed novel NAS m…

    Submitted 19 September, 2020; originally announced September 2020.

  40. arXiv:2006.11197  [pdf, other]

    cs.LG stat.ML

    Abstract Diagrammatic Reasoning with Multiplex Graph Networks

    Authors: Duo Wang, Mateja Jamnik, Pietro Lio

    Abstract: Abstract reasoning, particularly in the visual domain, is a complex human ability, but it remains a challenging problem for artificial neural learning systems. In this work we propose MXGNet, a multilayer graph neural network for multi-panel diagrammatic reasoning tasks. MXGNet combines three powerful concepts, namely, object-level representation, graph neural networks and multiplex graphs, for so…

    Submitted 19 June, 2020; originally announced June 2020.

  41. arXiv:2006.08698  [pdf, other]

    cs.LG stat.ML

    Extrapolatable Relational Reasoning With Comparators in Low-Dimensional Manifolds

    Authors: Duo Wang, Mateja Jamnik, Pietro Lio

    Abstract: While modern deep neural architectures generalise well when test data is sampled from the same distribution as training data, they fail badly for cases when the test data distribution differs from the training distribution even along a few dimensions. This lack of out-of-distribution generalisation is increasingly manifested when the tasks become more abstract and complex, such as in relational re…

    Submitted 3 October, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

  42. arXiv:2003.09676  [pdf, other]

    cs.LG stat.ML

    Probabilistic Dual Network Architecture Search on Graphs

    Authors: Yiren Zhao, Duo Wang, Xitong Gao, Robert Mullins, Pietro Lio, Mateja Jamnik

    Abstract: We present the first differentiable Network Architecture Search (NAS) for Graph Neural Networks (GNNs). GNNs show promising performance on a wide range of tasks, but require a large amount of architecture engineering. First, graphs are inherently a non-Euclidean and sophisticated data structure, leading to poor adaptivity of GNN architectures across different datasets. Second, a typical graph bloc…

    Submitted 21 March, 2020; originally announced March 2020.

  43. arXiv:2002.01335  [pdf, other]

    cs.CL cs.AI cs.LG cs.MA stat.ML

    Structural Inductive Biases in Emergent Communication

    Authors: Agnieszka Słowik, Abhinav Gupta, William L. Hamilton, Mateja Jamnik, Sean B. Holden, Christopher Pal

    Abstract: In order to communicate, humans flatten a complex representation of ideas and their attributes into a single word or a sentence. We investigate the impact of representation learning in artificial agents by developing graph referential games. We empirically show that agents parametrized by graph neural networks develop a more compositional language compared to bag-of-words and sequence models, whic…

    Submitted 27 July, 2021; v1 submitted 4 February, 2020; originally announced February 2020.

    Comments: The first two authors contributed equally. Poster presented at CogSci 2021

  44. arXiv:2001.09063  [pdf, other]

    cs.LG cs.AI cs.CL cs.MA stat.ML

    Towards Graph Representation Learning in Emergent Communication

    Authors: Agnieszka Słowik, Abhinav Gupta, William L. Hamilton, Mateja Jamnik, Sean B. Holden

    Abstract: Recent findings in neuroscience suggest that the human brain represents information in a geometric structure (for instance, through conceptual spaces). In order to communicate, we flatten the complex representation of entities and their attributes into a single word or a sentence. In this paper we use graph convolutional networks to support the evolution of language and cooperation in multi-agent…

    Submitted 4 February, 2020; v1 submitted 24 January, 2020; originally announced January 2020.

    Comments: The first two authors contributed equally. Accepted at the Reinforcement Learning in Games workshop at AAAI 2020

  45. arXiv:1910.08589  [pdf, ps, other]

    cs.LG stat.ML

    Decoupling feature propagation from the design of graph auto-encoders

    Authors: Paul Scherer, Helena Andres-Terre, Pietro Lio, Mateja Jamnik

    Abstract: We present two instances, L-GAE and L-VGAE, of the variational graph auto-encoding family (VGAE) based on separating feature propagation operations from graph convolution layers typically found in graph learning methods to a single linear matrix computation made prior to input in standard auto-encoder architectures. This decoupling enables the independent and fixed design of the auto-encoder witho…

    Submitted 18 October, 2019; originally announced October 2019.

    Comments: 4 pages (considering single anonymous naming during original submission, now a few lines over 4). Originally submitted to NeurIPS 2019 Graph Representation Learning Workshop

  46. arXiv:1909.09137  [pdf, other]

    cs.AI cs.LG

    Bayesian Optimisation with Gaussian Processes for Premise Selection

    Authors: Agnieszka Słowik, Chaitanya Mangla, Mateja Jamnik, Sean B. Holden, Lawrence C. Paulson

    Abstract: Heuristics in theorem provers are often parameterised. Modern theorem provers such as Vampire utilise a wide array of heuristics to control the search space explosion, thereby requiring optimisation of a large set of parameters. An exhaustive search in this multi-dimensional parameter space is intractable in most cases, yet the performance of the provers is highly dependent on the parameter assign…

    Submitted 18 September, 2019; originally announced September 2019.

  47. arXiv:1903.06581  [pdf, other]

    cs.CV cs.LG stat.ML

    Unsupervised and interpretable scene discovery with Discrete-Attend-Infer-Repeat

    Authors: Duo Wang, Mateja Jamnik, Pietro Lio

    Abstract: In this work we present Discrete Attend Infer Repeat (Discrete-AIR), a Recurrent Auto-Encoder with structured latent distributions containing discrete categorical distributions, continuous attribute distributions, and factorised spatial attention. While inspired by the original AIR model and retaining the AIR model's capability in identifying objects in an image, Discrete-AIR provides direct interpreta…

    Submitted 14 March, 2019; originally announced March 2019.

  48. Tactical Diagrammatic Reasoning

    Authors: Sven Linker, Jim Burton, Mateja Jamnik

    Abstract: Although automated reasoning with diagrams has been possible for some years, tools for diagrammatic reasoning are generally much less sophisticated than their sentential cousins. The tasks of exploring levels of automation and abstraction in the construction of proofs and of providing explanations of solutions expressed in the proofs remain to be addressed. In this paper we take an interactive pro…

    Submitted 24 January, 2017; originally announced January 2017.

    Comments: In Proceedings UITP 2016, arXiv:1701.06745

    Journal ref: EPTCS 239, 2017, pp. 29-42