
doi 10.1145/3638529.3654017

LLMatic: Neural Architecture Search via Large Language Models and Quality Diversity Optimization

Authors: Muhammad U. Nasir, Sam Earle, Christopher Cleghorn, Steven James, Julian Togelius

Abstract: Large Language Models (LLMs) have emerged as powerful tools capable of accomplishing a broad spectrum of tasks. Their abilities span numerous areas, and one area where they have made a significant impact is in the domain of code generation. Here, we propose using the coding abilities of LLMs to introduce meaningful variations to code defining neural networks. Meanwhile, Quality-Diversity (QD) algorithms are known to discover diverse and robust solutions. By merging the code-generating abilities of LLMs with the diversity and robustness of QD solutions, we introduce LLMatic, a Neural Architecture Search (NAS) algorithm. While LLMs struggle to conduct NAS directly through prompts, LLMatic uses a procedural approach, leveraging QD for prompts and network architecture to create diverse and high-performing networks. We test LLMatic on the CIFAR-10 and NAS-bench-201 benchmarks, demonstrating that it can produce competitive networks while evaluating just 2,000 candidates, even without prior knowledge of the benchmark domain or exposure to any previous top-performing models for the benchmark. The open-sourced code is available at https://github.com/umair-nasir14/LLMatic.

Submitted 12 April, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

Comments: Accepted to The Genetic and Evolutionary Computation Conference 2024

arXiv:2303.16912

doi 10.1016/j.ins.2024.121363

Training Feedforward Neural Networks with Bayesian Hyper-Heuristics

Authors: Arné Schreuder, Anna Bosman, Andries Engelbrecht, Christopher Cleghorn

Abstract: The process of training feedforward neural networks (FFNNs) can benefit from an automated process where the best heuristic to train the network is sought out automatically by means of a high-level probabilistic-based heuristic. This research introduces a novel population-based Bayesian hyper-heuristic (BHH) that is used to train feedforward neural networks (FFNNs). The performance of the BHH is compared to that of ten popular low-level heuristics, each with different search behaviours. The chosen heuristic pool consists of classic gradient-based heuristics as well as meta-heuristics (MHs). The empirical process is executed on fourteen datasets consisting of classification and regression problems with varying characteristics. The BHH is shown to be able to train FFNNs well and provide an automated method for finding the best heuristic to train the FFNNs at various stages of the training process.

Submitted 29 March, 2023; originally announced March 2023.

arXiv:2210.11442

Augmentative Topology Agents For Open-Ended Learning

Authors: Muhammad Umair Nasir, Michael Beukman, Steven James, Christopher Wesley Cleghorn

Abstract: In this work, we tackle the problem of open-ended learning by introducing a method that simultaneously evolves agents and increasingly challenging environments. Unlike previous open-ended approaches that optimize agents using a fixed neural network topology, we hypothesize that generalization can be improved by allowing agents' controllers to become more complex as they encounter more difficult environments. Our method, Augmentative Topology EPOET (ATEP), extends the Enhanced Paired Open-Ended Trailblazer (EPOET) algorithm by allowing agents to evolve their own neural network structures over time, adding complexity and capacity as necessary. Empirical results demonstrate that ATEP results in general agents capable of solving more environments than a fixed-topology baseline. We also investigate mechanisms for transferring agents between environments and find that a species-based approach further improves the performance and generalization of agents.

Submitted 11 October, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

Comments: Accepted to The Proceedings of Genetic and Evolutionary Computation Conference (GECCO) 2023

arXiv:2206.06903

A Local Optima Network Analysis of the Feedforward Neural Architecture Space

Authors: Isak Potgieter, Christopher W. Cleghorn, Anna S. Bosman

Abstract: This study investigates the use of local optima network (LON) analysis, a derivative of the fitness landscape of candidate solutions, to characterise and visualise the neural architecture space. The search space of feedforward neural network architectures with up to three layers, each with up to 10 neurons, is fully enumerated by evaluating trained model performance on a selection of data sets. Extracted LONs, while heterogeneous across data sets, all exhibit simple global structures, with single global funnels in all cases but one. These results yield early indication that LONs may provide a viable paradigm by which to analyse and optimise neural architectures.

Submitted 2 June, 2022; originally announced June 2022.

Comments: A version of this paper has been accepted for publication at IJCNN'22

arXiv:2204.06934

doi 10.1145/3512290.3528701

Procedural Content Generation using Neuroevolution and Novelty Search for Diverse Video Game Levels

Authors: Michael Beukman, Christopher W Cleghorn, Steven James

Abstract: Procedurally generated video game content has the potential to drastically reduce the content creation budget of game developers and large studios. However, adoption is hindered by limitations such as slow generation, as well as low quality and diversity of content. We introduce an evolutionary search-based approach for evolving level generators using novelty search to procedurally generate diverse levels in real time, without requiring training data or detailed domain-specific knowledge. We test our method on two domains, and our results show an order of magnitude speedup in generation time compared to existing methods while obtaining comparable metric scores. We further demonstrate the ability to generalise to arbitrary-sized levels without retraining.

Submitted 14 April, 2022; originally announced April 2022.

Comments: Accepted to the Genetic and Evolutionary Computation Conference (GECCO '22), July 9–13, 2022, Boston, MA, USA. Code is located at https://github.com/Michael-Beukman/PCGNN

arXiv:2201.10334

Towards Objective Metrics for Procedurally Generated Video Game Levels

Authors: Michael Beukman, Steven James, Christopher Cleghorn

Abstract: With increasing interest in procedural content generation by academia and game developers alike, it is vital that different approaches can be compared fairly. However, evaluating procedurally generated video game levels is often difficult, due to the lack of standardised, game-independent metrics. In this paper, we introduce two simulation-based evaluation metrics that involve analysing the behaviour of an A* agent to measure the diversity and difficulty of generated levels in a general, game-independent manner. Diversity is calculated by comparing action trajectories from different levels using the edit distance, and difficulty is measured as how much exploration and expansion of the A* search tree is necessary before the agent can solve the level. We demonstrate that our diversity metric is more robust to changes in level size and representation than current methods and additionally measures factors that directly affect playability, instead of focusing on visual information. The difficulty metric shows promise, as it correlates with existing estimates of difficulty in one of the tested domains, but it does face some challenges in the other domain. Finally, to promote reproducibility, we publicly release our evaluation framework.

Submitted 9 March, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

Comments: 7 pages, 10 figures. V3: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. Code is located at https://github.com/Michael-Beukman/PCGNN
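
Illustrative sketch (not the authors' released framework): the abstract above describes diversity as an edit-distance comparison of A* action trajectories from different levels. The Python below shows one way such a comparison could look. The function names, the representation of a trajectory as a sequence of action labels, the normalisation by the longer trajectory, and the averaging over all pairs are assumptions made for illustration; the abstract does not specify them.

```python
from itertools import combinations
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two action sequences."""
    # prev[j] holds the distance between the first i-1 items of a
    # and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, action_a in enumerate(a, start=1):
        curr = [i]
        for j, action_b in enumerate(b, start=1):
            cost = 0 if action_a == action_b else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]


def normalised_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Edit distance scaled to [0, 1] (assumed normalisation)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def mean_pairwise_diversity(trajectories: Sequence[Sequence[str]]) -> float:
    """Average normalised distance over all pairs of trajectories."""
    pairs = list(combinations(trajectories, 2))
    if not pairs:
        return 0.0
    return sum(normalised_distance(a, b) for a, b in pairs) / len(pairs)
```

Under these assumptions, identical trajectories score 0 and trajectories that share no aligned actions score 1, so a set of levels that all force the same solution path would receive a low diversity value.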

arXiv:2109.05606

doi 10.1007/978-3-031-14714-2_13

A Continuous Optimisation Benchmark Suite from Neural Network Regression

Authors: Katherine M. Malan, Christopher W. Cleghorn

Abstract: Designing optimisation algorithms that perform well in general requires experimentation on a range of diverse problems. Training neural networks is an optimisation task that has gained prominence with the recent successes of deep learning. Although evolutionary algorithms have been used for training neural networks, gradient descent variants are by far the most common choice with their trusted good performance on large-scale machine learning tasks. With this paper we contribute CORNN (Continuous Optimisation of Regression tasks using Neural Networks), a large suite for benchmarking the performance of any continuous black-box algorithm on neural network training problems. Using a range of regression problems and neural network architectures, problem instances with different dimensions and levels of difficulty can be created. We demonstrate the use of the CORNN Suite by comparing the performance of three evolutionary and swarm-based algorithms on over 300 problem instances, showing evidence of performance complementarity between the algorithms. As a baseline, the performance of the best population-based algorithm is benchmarked against a gradient-based approach. The CORNN suite is shared as a public web repository to facilitate easy integration with existing benchmarking platforms.

Submitted 3 September, 2022; v1 submitted 12 September, 2021; originally announced September 2021.

Comments: This preprint has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this contribution is published in Parallel Problem Solving from Nature - PPSN XVII, Lecture Notes in Computer Science, Vol 13398, and is available online at https://doi.org/10.1007/978-3-031-14714-2_13

Journal ref: In: Parallel Problem Solving from Nature - PPSN XVII. PPSN 2022. Lecture Notes in Computer Science, vol 13398. Springer, Cham (2022)

arXiv:2008.02093

doi 10.1109/SSCI50451.2021.9660085

Point Proposal Network: Accelerating Point Source Detection Through Deep Learning

Authors: Duncan Tilley, Christopher W. Cleghorn, Kshitij Thorat, Roger Deane

Abstract: Point source detection techniques are used to identify and localise point sources in radio astronomical surveys. With the development of the Square Kilometre Array (SKA) telescope, survey images will see a massive increase in size from Gigapixels to Terapixels. Point source detection has already proven to be a challenge in recent surveys performed by SKA pathfinder telescopes. This paper proposes the Point Proposal Network (PPN): a point source detector that utilises deep convolutional neural networks for fast source detection. Results measured on simulated MeerKAT images show that, although less precise when compared to leading alternative approaches, PPN performs source detection faster and is able to scale to large images, unlike the alternative approaches.

Submitted 4 February, 2021; v1 submitted 5 August, 2020; originally announced August 2020.

Journal ref: 2021 IEEE Symposium Series on Computational Intelligence (SSCI), 2021, pp. 1-8

arXiv:2004.00476

Particle Swarm Optimization: Stability Analysis using N-Informers under Arbitrary Coefficient Distributions

Authors: Christopher W Cleghorn, Belinda Stapelberg

Abstract: This paper derives, under minimal modelling assumptions, a simple-to-use theorem for obtaining both order-1 and order-2 stability criteria for a common class of particle swarm optimization (PSO) variants. Specifically, PSO variants that can be rewritten as a finite sum of stochastically weighted difference vectors between a particle's position and swarm informers are covered by the theorem. Additionally, the use of the derived theorem allows a PSO practitioner to obtain stability criteria that contain no artificial restriction on the relationship between control coefficients. Almost all previous PSO stability results have provided stability criteria under the restriction that the social and cognitive control coefficients are equal; such restrictions are not present when using the derived theorem. Using the derived theorem, as a demonstration of its ease of use, stability criteria are derived without the imposed restriction on the relation between the control coefficients for three popular PSO variants.

Submitted 1 April, 2020; originally announced April 2020.
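
Illustrative sketch (not taken from the paper): the abstract above covers PSO variants whose update is a finite sum of stochastically weighted difference vectors between a particle's position and its informers. The Python below shows what one such update could look like for a single particle. The function name `pso_step`, the inertia term, and the choice of a uniform distribution on [0, c) for each weight are assumptions made for illustration; the paper itself allows arbitrary coefficient distributions.

```python
import random
from typing import List, Sequence, Tuple

Vector = List[float]


def pso_step(position: Vector,
             velocity: Vector,
             informers: Sequence[Vector],
             coefficients: Sequence[float],
             inertia: float,
             rng: random.Random) -> Tuple[Vector, Vector]:
    """One N-informer PSO update for a single particle."""
    # Start from the inertia-scaled previous velocity.
    new_velocity = [inertia * v for v in velocity]
    # Add one stochastically weighted difference vector per informer.
    for informer, c in zip(informers, coefficients):
        for d in range(len(position)):
            weight = c * rng.random()  # assumed: weight ~ U(0, c)
            new_velocity[d] += weight * (informer[d] - position[d])
    new_position = [x + v for x, v in zip(position, new_velocity)]
    return new_position, new_velocity
```

With two informers, the particle's personal best and its neighbourhood best, this reduces to the familiar inertia-weight PSO update. Passing a different value of `c` for each informer corresponds to the unequal cognitive and social coefficients that the abstract says the theorem permits.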

arXiv:1909.13179

doi 10.1371/journal.pone.0230506

Modelling the health impact of food taxes and subsidies with price elasticities: the case for additional scaling of food consumption using the total food expenditure elasticity

Authors: Tony Blakely, Nhung Nghiem, Murat Genc, Anja Mizdrak, Linda Cobiac, Cliona Ni Mhurchu, Boyd Swinburn, Peter Scarborough, Christine Cleghorn

Abstract: Background: Food taxes and subsidies are one intervention to address poor diets. Price elasticity (PE) matrices are commonly used to model the change in food purchasing. Usually a PE matrix is generated in one setting then applied to another setting with differing starting consumption and prices of foods. This violates econometric assumptions resulting in likely misestimation of total food consumption. We illustrate rescaling all consumption after applying a PE matrix using a total food expenditure elasticity (TFEe, the expenditure elasticity for all food combined given the policy induced change in the total price of food). We use case studies of NZ$2 per 100g saturated fat (SAFA) tax, NZ$0.4 per 100g sugar tax, and a 20% fruit and vegetable (F&V) subsidy. Methods: We estimated changes in food purchasing using a NZ PE matrix applied conventionally, then with TFEe adjustment. Impacts were quantified for total food expenditure and health adjusted life years (HALYs) for the total NZ population alive in 2011 over the rest of their lifetime using a multistate lifetable model. Results: Two NZ studies gave TFEes of 0.68 and 0.83, with international estimates ranging from 0.46 to 0.90. Without TFEe adjustment, total food expenditure decreased with the tax policies and increased with the F&V subsidy, implausible directions of shift given economic theory. After TFEe adjustment, HALY gains reduced by a third to a half for the two taxes and reversed from an apparent health loss to a health gain for the F&V subsidy. With TFEe adjustment, HALY gains (in 1000s) were 1,805 (95% uncertainty interval 1,337 to 2,340) for the SAFA tax, 1,671 (1,220 to 2,269) for the sugar tax, and 953 (453 to 1,308) for the F&V subsidy. Conclusions: If PE matrices are applied in settings beyond where they were derived, additional scaling is likely required. We suggest that the TFEe is a useful scalar.

Submitted 28 September, 2019; originally announced September 2019.

Showing 1–10 of 10 results for author: Cleghorn, C