Search | arXiv e-print repository

Landscape-Aware Automated Algorithm Configuration using Multi-output Mixed Regression and Classification

Authors: Fu Xing Long, Moritz Frenzel, Peter Krause, Markus Gitterle, Thomas Bäck, Niki van Stein

Abstract: In landscape-aware algorithm selection problem, the effectiveness of feature-based predictive models strongly depends on the representativeness of training data for practical applications. In this work, we investigate the potential of randomly generated functions (RGF) for the model training, which cover a much more diverse set of optimization problem classes compared to the widely-used black-box… ▽ More In landscape-aware algorithm selection problem, the effectiveness of feature-based predictive models strongly depends on the representativeness of training data for practical applications. In this work, we investigate the potential of randomly generated functions (RGF) for the model training, which cover a much more diverse set of optimization problem classes compared to the widely-used black-box optimization benchmarking (BBOB) suite. Correspondingly, we focus on automated algorithm configuration (AAC), that is, selecting the best suited algorithm and fine-tuning its hyperparameters based on the landscape features of problem instances. Precisely, we analyze the performance of dense neural network (NN) models in handling the multi-output mixed regression and classification tasks using different training data sets, such as RGF and many-affine BBOB (MA-BBOB) functions. Based on our results on the BBOB functions in 5d and 20d, near optimal configurations can be identified using the proposed approach, which can most of the time outperform the off-the-shelf default configuration considered by practitioners with limited knowledge about AAC. Furthermore, the predicted configurations are competitive against the single best solver in many cases. Overall, configurations with better performance can be best identified by using NN models trained on a combination of RGF and MA-BBOB functions. △ Less

Submitted 2 September, 2024; originally announced September 2024.

arXiv:2407.12950 [pdf, other]

doi 10.1007/978-3-031-63787-2_16

Beyond the Veil of Similarity: Quantifying Semantic Continuity in Explainable AI

Authors: Qi Huang, Emanuele Mezzi, Osman Mutlu, Miltiadis Kofinas, Vidya Prasad, Shadnan Azwad Khan, Elena Ranguelova, Niki van Stein

Abstract: We introduce a novel metric for measuring semantic continuity in Explainable AI methods and machine learning models. We posit that for models to be truly interpretable and trustworthy, similar inputs should yield similar explanations, reflecting a consistent semantic understanding. By leveraging XAI techniques, we assess semantic continuity in the task of image recognition. We conduct experiments… ▽ More We introduce a novel metric for measuring semantic continuity in Explainable AI methods and machine learning models. We posit that for models to be truly interpretable and trustworthy, similar inputs should yield similar explanations, reflecting a consistent semantic understanding. By leveraging XAI techniques, we assess semantic continuity in the task of image recognition. We conduct experiments to observe how incremental changes in input affect the explanations provided by different XAI methods. Through this approach, we aim to evaluate the models' capability to generalize and abstract semantic concepts accurately and to evaluate different XAI methods in correctly capturing the model behaviour. This paper contributes to the broader discourse on AI interpretability by proposing a quantitative measure for semantic continuity for XAI methods, offering insights into the models' and explainers' internal reasoning processes, and promoting more reliable and transparent AI systems. △ Less

Submitted 17 July, 2024; originally announced July 2024.

Comments: 25 pages, accepted at the world conference of explainable AI, 2024, Malta

arXiv:2407.11511 [pdf, other]

Reasoning with Large Language Models, a Survey

Authors: Aske Plaat, Annie Wong, Suzan Verberne, Joost Broekens, Niki van Stein, Thomas Back

Abstract: Scaling up language models to billions of parameters has opened up possibilities for in-context learning, allowing instruction tuning and few-shot learning on tasks that the model was not specifically trained for. This has achieved breakthrough performance on language tasks such as translation, summarization, and question-answering. Furthermore, in addition to these associative "System 1" tasks, r… ▽ More Scaling up language models to billions of parameters has opened up possibilities for in-context learning, allowing instruction tuning and few-shot learning on tasks that the model was not specifically trained for. This has achieved breakthrough performance on language tasks such as translation, summarization, and question-answering. Furthermore, in addition to these associative "System 1" tasks, recent advances in Chain-of-thought prompt learning have demonstrated strong "System 2" reasoning abilities, answering a question in the field of artificial general intelligence whether LLMs can reason. The field started with the question whether LLMs can solve grade school math word problems. This paper reviews the rapidly expanding field of prompt-based reasoning with LLMs. Our taxonomy identifies different ways to generate, evaluate, and control multi-step reasoning. We provide an in-depth coverage of core approaches and open problems, and we propose a research agenda for the near future. Finally, we highlight the relation between reasoning and prompt-based learning, and we discuss the relation between reasoning, sequential decision processes, and reinforcement learning. We find that self-improvement, self-reflection, and some metacognitive abilities of the reasoning processes are possible through the judicious use of prompts. True self-improvement and self-reasoning, to go from reasoning with LLMs to reasoning by LLMs, remains future work. △ Less

Submitted 16 July, 2024; originally announced July 2024.

arXiv:2406.07811 [pdf, other]

Evolutionary Computation and Explainable AI: A Roadmap to Transparent Intelligent Systems

Authors: Ryan Zhou, Jaume Bacardit, Alexander Brownlee, Stefano Cagnoni, Martin Fyvie, Giovanni Iacca, John McCall, Niki van Stein, David Walker, Ting Hu

Abstract: AI methods are finding an increasing number of applications, but their often black-box nature has raised concerns about accountability and trust. The field of explainable artificial intelligence (XAI) has emerged in response to the need for human understanding of AI models. Evolutionary computation (EC), as a family of powerful optimization and learning tools, has significant potential to contribu… ▽ More AI methods are finding an increasing number of applications, but their often black-box nature has raised concerns about accountability and trust. The field of explainable artificial intelligence (XAI) has emerged in response to the need for human understanding of AI models. Evolutionary computation (EC), as a family of powerful optimization and learning tools, has significant potential to contribute to XAI. In this paper, we provide an introduction to XAI and review various techniques in current use for explaining machine learning (ML) models. We then focus on how EC can be used in XAI, and review some XAI approaches which incorporate EC techniques. Additionally, we discuss the application of XAI principles within EC itself, examining how these principles can shed some light on the behavior and outcomes of EC algorithms in general, on the (automatic) configuration of these algorithms, and on the underlying problem landscapes that these algorithms optimize. Finally, we discuss some open challenges in XAI and opportunities for future research in this field using EC. Our aim is to demonstrate that EC is well-suited for addressing current problems in explainability and to encourage further exploration of these methods to contribute to the development of more transparent and trustworthy ML models and EC algorithms. △ Less

Submitted 11 June, 2024; originally announced June 2024.

Comments: 29 pages, 4 figures. arXiv admin note: substantial text overlap with arXiv:2306.14786

arXiv:2406.06629 [pdf, other]

A Survey of Meta-features Used for Automated Selection of Algorithms for Black-box Single-objective Continuous Optimization

Authors: Gjorgjina Cenikj, Ana Nikolikj, Gašper Petelin, Niki van Stein, Carola Doerr, Tome Eftimov

Abstract: The selection of the most appropriate algorithm to solve a given problem instance, known as algorithm selection, is driven by the potential to capitalize on the complementary performance of different algorithms across sets of problem instances. However, determining the optimal algorithm for an unseen problem instance has been shown to be a challenging task, which has garnered significant attention… ▽ More The selection of the most appropriate algorithm to solve a given problem instance, known as algorithm selection, is driven by the potential to capitalize on the complementary performance of different algorithms across sets of problem instances. However, determining the optimal algorithm for an unseen problem instance has been shown to be a challenging task, which has garnered significant attention from researchers in recent years. In this survey, we conduct an overview of the key contributions to algorithm selection in the field of single-objective continuous black-box optimization. We present ongoing work in representation learning of meta-features for optimization problem instances, algorithm instances, and their interactions. We also study machine learning models for automated algorithm selection, configuration, and performance prediction. Through this analysis, we identify gaps in the state of the art, based on which we present ideas for further development of meta-feature representations. △ Less

Submitted 8 June, 2024; originally announced June 2024.

Comments: 14 pages, 2 figures

MSC Class: 68W50 (Primary) 68T30 (Secondary) ACM Class: F.2.1; I.2.4

arXiv:2405.20132 [pdf, other]

LLaMEA: A Large Language Model Evolutionary Algorithm for Automatically Generating Metaheuristics

Authors: Niki van Stein, Thomas Bäck

Abstract: Large Language Models (LLMs) such as GPT-4 have demonstrated their ability to understand natural language and generate complex code snippets. This paper introduces a novel Large Language Model Evolutionary Algorithm (LLaMEA) framework, leveraging GPT models for the automated generation and refinement of algorithms. Given a set of criteria and a task definition (the search space), LLaMEA iterativel… ▽ More Large Language Models (LLMs) such as GPT-4 have demonstrated their ability to understand natural language and generate complex code snippets. This paper introduces a novel Large Language Model Evolutionary Algorithm (LLaMEA) framework, leveraging GPT models for the automated generation and refinement of algorithms. Given a set of criteria and a task definition (the search space), LLaMEA iteratively generates, mutates and selects algorithms based on performance metrics and feedback from runtime evaluations. This framework offers a unique approach to generating optimized algorithms without requiring extensive prior expertise. We show how this framework can be used to generate novel black-box metaheuristic optimization algorithms automatically. LLaMEA generates multiple algorithms that outperform state-of-the-art optimization algorithms (Covariance Matrix Adaptation Evolution Strategy and Differential Evolution) on the five dimensional black box optimization benchmark (BBOB). The algorithms also show competitive performance on the 10- and 20-dimensional instances of the test functions, although they have not seen such instances during the automated generation process. The results demonstrate the feasibility of the framework and identify future directions for automated generation and optimization of algorithms via LLMs. △ Less

Submitted 20 August, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

Comments: Submitted to IEEE TEVC

arXiv:2405.10271 [pdf, other]

Automated Federated Learning via Informed Pruning

Authors: Christian Internò, Elena Raponi, Niki van Stein, Thomas Bäck, Markus Olhofer, Yaochu Jin, Barbara Hammer

Abstract: Federated learning (FL) represents a pivotal shift in machine learning (ML) as it enables collaborative training of local ML models coordinated by a central aggregator, all without the need to exchange local data. However, its application on edge devices is hindered by limited computational capabilities and data communication challenges, compounded by the inherent complexity of Deep Learning (DL)… ▽ More Federated learning (FL) represents a pivotal shift in machine learning (ML) as it enables collaborative training of local ML models coordinated by a central aggregator, all without the need to exchange local data. However, its application on edge devices is hindered by limited computational capabilities and data communication challenges, compounded by the inherent complexity of Deep Learning (DL) models. Model pruning is identified as a key technique for compressing DL models on devices with limited resources. Nonetheless, conventional pruning techniques typically rely on manually crafted heuristics and demand human expertise to achieve a balance between model size, speed, and accuracy, often resulting in sub-optimal solutions. In this study, we introduce an automated federated learning approach utilizing informed pruning, called AutoFLIP, which dynamically prunes and compresses DL models within both the local clients and the global server. It leverages a federated loss exploration phase to investigate model gradient behavior across diverse datasets and losses, providing insights into parameter significance. Our experiments showcase notable enhancements in scenarios with strong non-IID data, underscoring AutoFLIP's capacity to tackle computational constraints and achieve superior global convergence. △ Less

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2404.17323 [pdf, other]

A Deep Dive into Effects of Structural Bias on CMA-ES Performance along Affine Trajectories

Authors: Niki van Stein, Sarah L. Thomson, Anna V. Kononova

Abstract: To guide the design of better iterative optimisation heuristics, it is imperative to understand how inherent structural biases within algorithm components affect the performance on a wide variety of search landscapes. This study explores the impact of structural bias in the modular Covariance Matrix Adaptation Evolution Strategy (modCMA), focusing on the roles of various modulars within the algori… ▽ More To guide the design of better iterative optimisation heuristics, it is imperative to understand how inherent structural biases within algorithm components affect the performance on a wide variety of search landscapes. This study explores the impact of structural bias in the modular Covariance Matrix Adaptation Evolution Strategy (modCMA), focusing on the roles of various modulars within the algorithm. Through an extensive investigation involving 435,456 configurations of modCMA, we identified key modules that significantly influence structural bias of various classes. Our analysis utilized the Deep-BIAS toolbox for structural bias detection and classification, complemented by SHAP analysis for quantifying module contributions. The performance of these configurations was tested on a sequence of affine-recombined functions, maintaining fixed optimum locations while gradually varying the landscape features. Our results demonstrate an interplay between module-induced structural bias and algorithm performance across different landscape characteristics. △ Less

Submitted 26 April, 2024; originally announced April 2024.

Comments: 15 pages, 5 figures, submitted to PPSN 2024

arXiv:2402.06299 [pdf, other]

A Functional Analysis Approach to Symbolic Regression

Authors: Kirill Antonov, Roman Kalkreuth, Kaifeng Yang, Thomas Bäck, Niki van Stein, Anna V Kononova

Abstract: Symbolic regression (SR) poses a significant challenge for randomized search heuristics due to its reliance on the synthesis of expressions for input-output mappings. Although traditional genetic programming (GP) algorithms have achieved success in various domains, they exhibit limited performance when tree-based representations are used for SR. To address these limitations, we introduce a novel S… ▽ More Symbolic regression (SR) poses a significant challenge for randomized search heuristics due to its reliance on the synthesis of expressions for input-output mappings. Although traditional genetic programming (GP) algorithms have achieved success in various domains, they exhibit limited performance when tree-based representations are used for SR. To address these limitations, we introduce a novel SR approach called Fourier Tree Growing (FTG) that draws insights from functional analysis. This new perspective enables us to perform optimization directly in a different space, thus avoiding intricate symbolic expressions. Our proposed algorithm exhibits significant performance improvements over traditional GP methods on a range of classical one-dimensional benchmarking problems. To identify and explain limiting factors of GP and FTG, we perform experiments on a large-scale polynomials benchmark with high-order polynomials up to degree 100. To the best of the authors' knowledge, this work represents the pioneering application of functional analysis in addressing SR problems. The superior performance of the proposed algorithm and insights into the limitations of GP open the way for further advancing GP for SR and related areas of explainable machine learning. △ Less

Submitted 9 February, 2024; originally announced February 2024.

Comments: 14 pages, 3 figures. Submitted to Genetic and Evolutionary Computation Conference (GECCO-2024)

arXiv:2402.01343 [pdf, other]

Shapelet-based Model-agnostic Counterfactual Local Explanations for Time Series Classification

Authors: Qi Huang, Wei Chen, Thomas Bäck, Niki van Stein

Abstract: In this work, we propose a model-agnostic instance-based post-hoc explainability method for time series classification. The proposed algorithm, namely Time-CF, leverages shapelets and TimeGAN to provide counterfactual explanations for arbitrary time series classifiers. We validate the proposed method on several real-world univariate time series classification tasks from the UCR Time Series Archive… ▽ More In this work, we propose a model-agnostic instance-based post-hoc explainability method for time series classification. The proposed algorithm, namely Time-CF, leverages shapelets and TimeGAN to provide counterfactual explanations for arbitrary time series classifiers. We validate the proposed method on several real-world univariate time series classification tasks from the UCR Time Series Archive. The results indicate that the counterfactual instances generated by Time-CF when compared to state-of-the-art methods, demonstrate better performance in terms of four explainability metrics: closeness, sensibility, plausibility, and sparsity. △ Less

Submitted 2 February, 2024; originally announced February 2024.

Comments: The paper has been accepted by the XAI4Sci workshop of AAAI 2024

arXiv:2401.17842 [pdf, other]

Explainable Benchmarking for Iterative Optimization Heuristics

Authors: Niki van Stein, Diederick Vermetten, Anna V. Kononova, Thomas Bäck

Abstract: Benchmarking heuristic algorithms is vital to understand under which conditions and on what kind of problems certain algorithms perform well. In most current research into heuristic optimization algorithms, only a very limited number of scenarios, algorithm configurations and hyper-parameter settings are explored, leading to incomplete and often biased insights and results. This paper presents a n… ▽ More Benchmarking heuristic algorithms is vital to understand under which conditions and on what kind of problems certain algorithms perform well. In most current research into heuristic optimization algorithms, only a very limited number of scenarios, algorithm configurations and hyper-parameter settings are explored, leading to incomplete and often biased insights and results. This paper presents a novel approach we call explainable benchmarking. Introducing the IOH-Xplainer software framework, for analyzing and understanding the performance of various optimization algorithms and the impact of their different components and hyper-parameters. We showcase the framework in the context of two modular optimization frameworks. Through this framework, we examine the impact of different algorithmic components and configurations, offering insights into their performance across diverse scenarios. We provide a systematic method for evaluating and interpreting the behaviour and efficiency of iterative optimization heuristics in a more transparent and comprehensible manner, allowing for better benchmarking and algorithm design. △ Less

Submitted 23 February, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

Comments: Submitted to ACM TELO

arXiv:2309.12078 [pdf, other]

Clustering-based Domain-Incremental Learning

Authors: Christiaan Lamers, Rene Vidal, Nabil Belbachir, Niki van Stein, Thomas Baeck, Paris Giampouras

Abstract: We consider the problem of learning multiple tasks in a continual learning setting in which data from different tasks is presented to the learner in a streaming fashion. A key challenge in this setting is the so-called "catastrophic forgetting problem", in which the performance of the learner in an "old task" decreases when subsequently trained on a "new task". Existing continual learning methods,… ▽ More We consider the problem of learning multiple tasks in a continual learning setting in which data from different tasks is presented to the learner in a streaming fashion. A key challenge in this setting is the so-called "catastrophic forgetting problem", in which the performance of the learner in an "old task" decreases when subsequently trained on a "new task". Existing continual learning methods, such as Averaged Gradient Episodic Memory (A-GEM) and Orthogonal Gradient Descent (OGD), address catastrophic forgetting by minimizing the loss for the current task without increasing the loss for previous tasks. However, these methods assume the learner knows when the task changes, which is unrealistic in practice. In this paper, we alleviate the need to provide the algorithm with information about task changes by using an online clustering-based approach on a dynamically updated finite pool of samples or gradients. We thereby successfully counteract catastrophic forgetting in one of the hardest settings, namely: domain-incremental learning, a setting for which the problem was previously unsolved. We showcase the benefits of our approach by applying these ideas to projection-based methods, such as A-GEM and OGD, which lead to task-agnostic versions of them. Experiments on real datasets demonstrate the effectiveness of the proposed strategy and its promising performance compared to state-of-the-art methods. △ Less

Submitted 21 September, 2023; originally announced September 2023.

arXiv:2306.02985 [pdf, other]

Representation-agnostic distance-driven perturbation for optimizing ill-conditioned problems

Authors: Kirill Antonov, Anna V. Kononova, Thomas Bäck, Niki van Stein

Abstract: Locality is a crucial property for efficiently optimising black-box problems with randomized search heuristics. However, in practical applications, it is not likely to always find such a genotype encoding of candidate solutions that this property is upheld with respect to the Hamming distance. At the same time, it may be possible to use domain-specific knowledge to define a metric with locality pr… ▽ More Locality is a crucial property for efficiently optimising black-box problems with randomized search heuristics. However, in practical applications, it is not likely to always find such a genotype encoding of candidate solutions that this property is upheld with respect to the Hamming distance. At the same time, it may be possible to use domain-specific knowledge to define a metric with locality property. We propose two mutation operators to solve such optimization problems more efficiently using the metric. The first operator assumes prior knowledge about the distance, the second operator uses the distance as a black box. Those operators apply an estimation of distribution algorithm to find the best mutant according to the defined in the paper function, which employs the given distance. For pseudo-boolean and integer optimization problems, we experimentally show that both mutation operators speed up the search on most of the functions when applied in considered evolutionary algorithms and random local search. Moreover, those operators can be applied in any randomized search heuristic which uses perturbations. However, our mutation operators increase wall-clock time and so are helpful in practice when distance is (much) cheaper to compute than the real objective function. △ Less

Submitted 5 June, 2023; originally announced June 2023.

Comments: 10 pages, 2 figures

arXiv:2305.15245 [pdf, other]

Challenges of ELA-guided Function Evolution using Genetic Programming

Authors: Fu Xing Long, Diederick Vermetten, Anna V. Kononova, Roman Kalkreuth, Kaifeng Yang, Thomas Bäck, Niki van Stein

Abstract: Within the optimization community, the question of how to generate new optimization problems has been gaining traction in recent years. Within topics such as instance space analysis (ISA), the generation of new problems can provide new benchmarks which are not yet explored in existing research. Beyond that, this function generation can also be exploited for solving complex real-world optimization… ▽ More Within the optimization community, the question of how to generate new optimization problems has been gaining traction in recent years. Within topics such as instance space analysis (ISA), the generation of new problems can provide new benchmarks which are not yet explored in existing research. Beyond that, this function generation can also be exploited for solving complex real-world optimization problems. By generating functions with similar properties to the target problem, we can create a robust test set for algorithm selection and configuration. However, the generation of functions with specific target properties remains challenging. While features exist to capture low-level landscape properties, they might not always capture the intended high-level features. We show that a genetic programming (GP) approach guided by these exploratory landscape analysis (ELA) properties is not always able to find satisfying functions. Our results suggest that careful considerations of the weighting of landscape properties, as well as the distance measure used, might be required to evolve functions that are sufficiently representative to the target landscape. △ Less

Submitted 24 May, 2023; originally announced May 2023.

Showing 1–14 of 14 results for author: van Stein, N