-
LLMatic: Neural Architecture Search via Large Language Models and Quality Diversity Optimization
Authors:
Muhammad U. Nasir,
Sam Earle,
Christopher Cleghorn,
Steven James,
Julian Togelius
Abstract:
Large Language Models (LLMs) have emerged as powerful tools capable of accomplishing a broad spectrum of tasks. Their abilities span numerous areas, and one area where they have made a significant impact is in the domain of code generation. Here, we propose using the coding abilities of LLMs to introduce meaningful variations to code defining neural networks. Meanwhile, Quality-Diversity (QD) algorithms are known to discover diverse and robust solutions. By merging the code-generating abilities of LLMs with the diversity and robustness of QD solutions, we introduce LLMatic, a Neural Architecture Search (NAS) algorithm. While LLMs struggle to conduct NAS directly through prompts, LLMatic uses a procedural approach, leveraging QD for prompts and network architecture to create diverse and high-performing networks. We test LLMatic on the CIFAR-10 and NAS-bench-201 benchmarks, demonstrating that it can produce competitive networks while evaluating just 2,000 candidates, even without prior knowledge of the benchmark domain or exposure to any previous top-performing models for the benchmark. The open-sourced code is available at https://github.com/umair-nasir14/LLMatic.
Submitted 12 April, 2024; v1 submitted 1 June, 2023;
originally announced June 2023.
-
Training Feedforward Neural Networks with Bayesian Hyper-Heuristics
Authors:
Arné Schreuder,
Anna Bosman,
Andries Engelbrecht,
Christopher Cleghorn
Abstract:
The process of training feedforward neural networks (FFNNs) can benefit from an automated process where the best heuristic to train the network is sought out automatically by means of a high-level probabilistic-based heuristic. This research introduces a novel population-based Bayesian hyper-heuristic (BHH) that is used to train FFNNs. The performance of the BHH is compared to that of ten popular low-level heuristics, each with different search behaviours. The chosen heuristic pool consists of classic gradient-based heuristics as well as meta-heuristics (MHs). The empirical process is executed on fourteen datasets consisting of classification and regression problems with varying characteristics. The BHH is shown to be able to train FFNNs well and provide an automated method for finding the best heuristic to train the FFNNs at various stages of the training process.
Submitted 29 March, 2023;
originally announced March 2023.
-
Augmentative Topology Agents For Open-Ended Learning
Authors:
Muhammad Umair Nasir,
Michael Beukman,
Steven James,
Christopher Wesley Cleghorn
Abstract:
In this work, we tackle the problem of open-ended learning by introducing a method that simultaneously evolves agents and increasingly challenging environments. Unlike previous open-ended approaches that optimize agents using a fixed neural network topology, we hypothesize that generalization can be improved by allowing agents' controllers to become more complex as they encounter more difficult environments. Our method, Augmentative Topology EPOET (ATEP), extends the Enhanced Paired Open-Ended Trailblazer (EPOET) algorithm by allowing agents to evolve their own neural network structures over time, adding complexity and capacity as necessary. Empirical results demonstrate that ATEP produces general agents capable of solving more environments than a fixed-topology baseline. We also investigate mechanisms for transferring agents between environments and find that a species-based approach further improves the performance and generalization of agents.
Submitted 11 October, 2023; v1 submitted 20 October, 2022;
originally announced October 2022.
-
A Local Optima Network Analysis of the Feedforward Neural Architecture Space
Authors:
Isak Potgieter,
Christopher W. Cleghorn,
Anna S. Bosman
Abstract:
This study investigates the use of local optima network (LON) analysis, a derivative of the fitness landscape of candidate solutions, to characterise and visualise the neural architecture space. The search space of feedforward neural network architectures with up to three layers, each with up to 10 neurons, is fully enumerated by evaluating trained model performance on a selection of data sets. Extracted LONs, while heterogeneous across data sets, all exhibit simple global structures, with single global funnels in all cases but one. These results yield an early indication that LONs may provide a viable paradigm by which to analyse and optimise neural architectures.
Submitted 2 June, 2022;
originally announced June 2022.
-
Procedural Content Generation using Neuroevolution and Novelty Search for Diverse Video Game Levels
Authors:
Michael Beukman,
Christopher W Cleghorn,
Steven James
Abstract:
Procedurally generated video game content has the potential to drastically reduce the content creation budget of game developers and large studios. However, adoption is hindered by limitations such as slow generation, as well as low quality and diversity of content. We introduce an evolutionary search-based approach for evolving level generators using novelty search to procedurally generate diverse levels in real time, without requiring training data or detailed domain-specific knowledge. We test our method on two domains, and our results show an order of magnitude speedup in generation time compared to existing methods while obtaining comparable metric scores. We further demonstrate the ability to generalise to arbitrary-sized levels without retraining.
Submitted 14 April, 2022;
originally announced April 2022.
-
Towards Objective Metrics for Procedurally Generated Video Game Levels
Authors:
Michael Beukman,
Steven James,
Christopher Cleghorn
Abstract:
With increasing interest in procedural content generation by academia and game developers alike, it is vital that different approaches can be compared fairly. However, evaluating procedurally generated video game levels is often difficult, due to the lack of standardised, game-independent metrics. In this paper, we introduce two simulation-based evaluation metrics that involve analysing the behaviour of an A* agent to measure the diversity and difficulty of generated levels in a general, game-independent manner. Diversity is calculated by comparing action trajectories from different levels using the edit distance, and difficulty is measured as how much exploration and expansion of the A* search tree is necessary before the agent can solve the level. We demonstrate that our diversity metric is more robust to changes in level size and representation than current methods and additionally measures factors that directly affect playability, instead of focusing on visual information. The difficulty metric shows promise, as it correlates with existing estimates of difficulty in one of the tested domains, but it does face some challenges in the other domain. Finally, to promote reproducibility, we publicly release our evaluation framework.
Submitted 9 March, 2022; v1 submitted 25 January, 2022;
originally announced January 2022.
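The diversity metric in the abstract above compares action trajectories using the edit distance. A minimal sketch of that idea follows, assuming each trajectory is a sequence of discrete action labels and that a set of levels is summarised by the mean pairwise distance. The function names, the action labels, and the averaging step are illustrative assumptions, not the authors' released framework.

```python
from itertools import combinations
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two action sequences (unit costs)."""
    # prev[j] holds the distance between the first i-1 items of a and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from the first i items of a to an empty sequence
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


def mean_pairwise_distance(trajectories: Sequence[Sequence[str]]) -> float:
    """Assumed aggregate: average edit distance over all pairs of trajectories."""
    pairs = list(combinations(trajectories, 2))
    if not pairs:
        return 0.0
    return sum(edit_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical agent trajectories from three levels (L = left, R = right, U = up).
levels = [["L", "L", "U"], ["L", "U"], ["R"]]
diversity = mean_pairwise_distance(levels)
```

Under these assumptions, a larger value means the agent had to act more differently across levels. Any normalisation by trajectory length is left out because the abstract does not specify one.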
-
A Continuous Optimisation Benchmark Suite from Neural Network Regression
Authors:
Katherine M. Malan,
Christopher W. Cleghorn
Abstract:
Designing optimisation algorithms that perform well in general requires experimentation on a range of diverse problems. Training neural networks is an optimisation task that has gained prominence with the recent successes of deep learning. Although evolutionary algorithms have been used for training neural networks, gradient descent variants are by far the most common choice with their trusted good performance on large-scale machine learning tasks. With this paper we contribute CORNN (Continuous Optimisation of Regression tasks using Neural Networks), a large suite for benchmarking the performance of any continuous black-box algorithm on neural network training problems. Using a range of regression problems and neural network architectures, problem instances with different dimensions and levels of difficulty can be created. We demonstrate the use of the CORNN Suite by comparing the performance of three evolutionary and swarm-based algorithms on over 300 problem instances, showing evidence of performance complementarity between the algorithms. As a baseline, the performance of the best population-based algorithm is benchmarked against a gradient-based approach. The CORNN suite is shared as a public web repository to facilitate easy integration with existing benchmarking platforms.
Submitted 3 September, 2022; v1 submitted 12 September, 2021;
originally announced September 2021.
-
Point Proposal Network: Accelerating Point Source Detection Through Deep Learning
Authors:
Duncan Tilley,
Christopher W. Cleghorn,
Kshitij Thorat,
Roger Deane
Abstract:
Point source detection techniques are used to identify and localise point sources in radio astronomical surveys. With the development of the Square Kilometre Array (SKA) telescope, survey images will see a massive increase in size from Gigapixels to Terapixels. Point source detection has already proven to be a challenge in recent surveys performed by SKA pathfinder telescopes. This paper proposes the Point Proposal Network (PPN): a point source detector that utilises deep convolutional neural networks for fast source detection. Results measured on simulated MeerKAT images show that, although less precise when compared to leading alternative approaches, PPN performs source detection faster and is able to scale to large images, unlike the alternative approaches.
Submitted 4 February, 2021; v1 submitted 5 August, 2020;
originally announced August 2020.
-
Particle Swarm Optimization: Stability Analysis using N-Informers under Arbitrary Coefficient Distributions
Authors:
Christopher W Cleghorn,
Belinda Stapelberg
Abstract:
This paper derives, under minimal modelling assumptions, a simple-to-use theorem for obtaining both order-1 and order-2 stability criteria for a common class of particle swarm optimization (PSO) variants. Specifically, PSO variants that can be rewritten as a finite sum of stochastically weighted difference vectors between a particle's position and swarm informers are covered by the theorem. Additionally, the use of the derived theorem allows a PSO practitioner to obtain stability criteria that contain no artificial restriction on the relationship between control coefficients. Almost all previous PSO stability results have provided stability criteria under the restriction that the social and cognitive control coefficients are equal; such restrictions are not present when using the derived theorem. Using the derived theorem, as a demonstration of its ease of use, stability criteria are derived without the imposed restriction on the relation between the control coefficients for three popular PSO variants.
Submitted 1 April, 2020;
originally announced April 2020.
-
Modelling the health impact of food taxes and subsidies with price elasticities: the case for additional scaling of food consumption using the total food expenditure elasticity
Authors:
Tony Blakely,
Nhung Nghiem,
Murat Genc,
Anja Mizdrak,
Linda Cobiac,
Cliona Ni Mhurchu,
Boyd Swinburn,
Peter Scarborough,
Christine Cleghorn
Abstract:
Background: Food taxes and subsidies are one intervention to address poor diets. Price elasticity (PE) matrices are commonly used to model the change in food purchasing. Usually a PE matrix is generated in one setting then applied to another setting with differing starting consumption and prices of foods. This violates econometric assumptions, resulting in likely misestimation of total food consumption. We illustrate rescaling all consumption after applying a PE matrix using a total food expenditure elasticity (TFEe, the expenditure elasticity for all food combined given the policy-induced change in the total price of food). We use case studies of a NZ$2 per 100g saturated fat (SAFA) tax, a NZ$0.4 per 100g sugar tax, and a 20% fruit and vegetable (F&V) subsidy. Methods: We estimated changes in food purchasing using a NZ PE matrix applied conventionally, then with TFEe adjustment. Impacts were quantified for total food expenditure and health-adjusted life years (HALYs) for the total NZ population alive in 2011 over the rest of their lifetime using a multistate lifetable model. Results: Two NZ studies gave TFEes of 0.68 and 0.83, with international estimates ranging from 0.46 to 0.90. Without TFEe adjustment, total food expenditure decreased with the tax policies and increased with the F&V subsidy, implausible directions of shift given economic theory. After TFEe adjustment, HALY gains reduced by a third to a half for the two taxes and reversed from an apparent health loss to a health gain for the F&V subsidy. With TFEe adjustment, HALY gains (in 1000s) were 1,805 (95% uncertainty interval 1,337 to 2,340) for the SAFA tax, 1,671 (1,220 to 2,269) for the sugar tax, and 953 (453 to 1,308) for the F&V subsidy. Conclusions: If PE matrices are applied in settings beyond where they were derived, additional scaling is likely required. We suggest that the TFEe is a useful scalar.
Submitted 28 September, 2019;
originally announced September 2019.