Search | arXiv e-print repository

Making New Connections: LLMs as Puzzle Generators for The New York Times' Connections Word Game

Authors: Tim Merino, Sam Earle, Ryan Sudhakaran, Shyam Sudhakaran, Julian Togelius

Abstract: The Connections puzzle is a word association game published daily by The New York Times (NYT). In this game, players are asked to find groups of four words that are connected by a common theme. While solving a given Connections puzzle requires both semantic knowledge and abstract reasoning, generating novel puzzles additionally requires a form of metacognition: generators must be able to accurately model the downstream reasoning of potential solvers. In this paper, we investigate the ability of the GPT family of Large Language Models (LLMs) to generate challenging and creative word games for human players. We start with an analysis of the word game Connections and the unique challenges it poses as a Procedural Content Generation (PCG) domain. We then propose a method for generating Connections puzzles using LLMs by adapting a Tree of Thoughts (ToT) prompting approach. We evaluate this method by conducting a user study, asking human players to compare AI-generated puzzles against published Connections puzzles. Our findings show that LLMs are capable puzzle creators, and can generate diverse sets of enjoyable, challenging, and creative Connections puzzles as judged by human users.

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2407.09388 [pdf, other]

GAVEL: Generating Games Via Evolution and Language Models

Authors: Graham Todd, Alexander Padula, Matthew Stephenson, Éric Piette, Dennis J. N. J. Soemers, Julian Togelius

Abstract: Automatically generating novel and interesting games is a complex task. Challenges include representing game rules in a computationally workable form, searching through the large space of potential games under most such representations, and accurately evaluating the originality and quality of previously unseen games. Prior work in automated game generation has largely focused on relatively restricted rule representations and relied on domain-specific heuristics. In this work, we explore the generation of novel games in the comparatively expansive Ludii game description language, which encodes the rules of over 1000 board games in a variety of styles and modes of play. We draw inspiration from recent advances in large language models and evolutionary computation in order to train a model that intelligently mutates and recombines games and mechanics expressed as code. We demonstrate both quantitatively and qualitatively that our approach is capable of generating new and interesting games, including in regions of the potential rules space not covered by existing games in the Ludii dataset. A sample of the generated games is available to play online through the Ludii portal.

Submitted 12 July, 2024; originally announced July 2024.

Comments: 9 pages, 4 figures, 4 pages appendices

arXiv:2407.04221 [pdf, other]

Autoverse: An Evolvable Game Language for Learning Robust Embodied Agents

Authors: Sam Earle, Julian Togelius

Abstract: We introduce Autoverse, an evolvable, domain-specific language for single-player 2D grid-based games, and demonstrate its use as a scalable training ground for Open-Ended Learning (OEL) algorithms. Autoverse uses cellular-automaton-like rewrite rules to describe game mechanics, allowing it to express various game environments (e.g. mazes, dungeons, sokoban puzzles) that are popular testbeds for Reinforcement Learning (RL) agents. Each rewrite rule can be expressed as a series of simple convolutions, allowing for environments to be parallelized on the GPU, thereby drastically accelerating RL training. Using Autoverse, we propose jump-starting open-ended learning by imitation learning from search. In such an approach, we first evolve Autoverse environments (their rules and initial map topology) to maximize the number of iterations required by greedy tree search to discover a new best solution, producing a curriculum of increasingly complex environments and playtraces. We then distill these expert playtraces into a neural-network-based policy using imitation learning. Finally, we use the learned policy as a starting point for open-ended RL, where new training environments are continually evolved to maximize the RL player agent's value function error (a proxy for its regret, or the learnability of generated environments), finding that this approach improves the performance and generality of resultant player agents.

Submitted 4 July, 2024; originally announced July 2024.

Comments: 9 pages, 4 figures

arXiv:2405.13242 [pdf, other]

Goals as Reward-Producing Programs

Authors: Guy Davidson, Graham Todd, Julian Togelius, Todd M. Gureckis, Brenden M. Lake

Abstract: People are remarkably capable of generating their own goals, beginning with child's play and continuing into adulthood. Despite considerable empirical and computational work on goals and goal-oriented behavior, models are still far from capturing the richness of everyday human goals. Here, we bridge this gap by collecting a dataset of human-generated playful goals, modeling them as reward-producing programs, and generating novel human-like goals through program synthesis. Reward-producing programs capture the rich semantics of goals through symbolic operations that compose, add temporal constraints, and allow for program execution on behavioral traces to evaluate progress. To build a generative model of goals, we learn a fitness function over the infinite set of possible goal programs and sample novel goals with a quality-diversity algorithm. Human evaluators found that model-generated goals, when sampled from partitions of program space occupied by human examples, were indistinguishable from human-created games. We also discovered that our model's internal fitness scores predict games that are evaluated as more fun to play and more human-like.

Submitted 30 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

Comments: Project website and goal program viewer: https://exps.gureckislab.org/guydav/goal_programs_viewer/main/

arXiv:2405.06686 [pdf, other]

Word2World: Generating Stories and Worlds through Large Language Models

Authors: Muhammad U. Nasir, Steven James, Julian Togelius

Abstract: Large Language Models (LLMs) have proven their worth across a diverse spectrum of disciplines. LLMs have shown great potential in Procedural Content Generation (PCG) as well, but directly generating a level through a pre-trained LLM is still challenging. This work introduces Word2World, a system that enables LLMs to procedurally design playable games through stories, without any task-specific fine-tuning. Word2World leverages the abilities of LLMs to create diverse content and extract information. Combining these abilities, LLMs can create a story for the game, design narrative, and place tiles in appropriate places to create coherent worlds and playable games. We test Word2World with different LLMs and perform a thorough ablation study to validate each step. We open-source the code at https://github.com/umair-nasir14/Word2World.

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2404.15538 [pdf, other]

DreamCraft: Text-Guided Generation of Functional 3D Environments in Minecraft

Authors: Sam Earle, Filippos Kokkinos, Yuhe Nie, Julian Togelius, Roberta Raileanu

Abstract: Procedural Content Generation (PCG) algorithms enable the automatic generation of complex and diverse artifacts. However, they don't provide high-level control over the generated content and typically require domain expertise. In contrast, text-to-3D methods allow users to specify desired characteristics in natural language, offering a high amount of flexibility and expressivity. But unlike PCG, such approaches cannot guarantee functionality, which is crucial for certain applications like game design. In this paper, we present a method for generating functional 3D artifacts from free-form text prompts in the open-world game Minecraft. Our method, DreamCraft, trains quantized Neural Radiance Fields (NeRFs) to represent artifacts that, when viewed in-game, match given text descriptions. We find that DreamCraft produces more aligned in-game artifacts than a baseline that post-processes the output of an unconstrained NeRF. Thanks to the quantized representation of the environment, functional constraints can be integrated using specialized loss terms. We show how this can be leveraged to generate 3D structures that match a target distribution or obey certain adjacency rules over the block types. DreamCraft inherits a high degree of expressivity and controllability from the NeRF, while still being able to incorporate functional constraints through domain-specific objectives.

Submitted 23 April, 2024; originally announced April 2024.

Comments: 16 pages, 9 figures, accepted to Foundations of Digital Games 2024

arXiv:2404.11730 [pdf, other]

Missed Connections: Lateral Thinking Puzzles for Large Language Models

Authors: Graham Todd, Tim Merino, Sam Earle, Julian Togelius

Abstract: The Connections puzzle published each day by the New York Times tasks players with dividing a bank of sixteen words into four groups of four words that each relate to a common theme. Solving the puzzle requires both common linguistic knowledge (i.e. definitions and typical usage) as well as, in many cases, lateral or abstract thinking. This is because the four categories ascend in complexity, with the most challenging category often requiring thinking about words in uncommon ways or as parts of larger phrases. We investigate the capacity for automated AI systems to play Connections and explore the game's potential as an automated benchmark for abstract reasoning and a way to measure the semantic information encoded by data-driven linguistic systems. In particular, we study both a sentence-embedding baseline and modern large language models (LLMs). We report their accuracy on the task, measure the impacts of chain-of-thought prompting, and discuss their failure modes. Overall, we find that the Connections task is challenging yet feasible, and a strong test-bed for future work.

Submitted 21 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

Comments: 8 pages, 3 figures

arXiv:2403.12047 [pdf, other]

Alpha-wolves and Alpha-mammals: Exploring Dictionary Attacks on Iris Recognition Systems

Authors: Sudipta Banerjee, Anubhav Jain, Zehua Jiang, Nasir Memon, Julian Togelius, Arun Ross

Abstract: A dictionary attack in a biometric system entails the use of a small number of strategically generated images or templates to successfully match with a large number of identities, thereby compromising security. We focus on dictionary attacks at the template level, specifically the IrisCodes used in iris recognition systems. We present a hitherto unknown vulnerability wherein we mix IrisCodes using simple bitwise operators to generate alpha-mixtures - alpha-wolves (combining a set of "wolf" samples) and alpha-mammals (combining a set of users selected via search optimization) that increase false matches. We evaluate this vulnerability using the IITD, CASIA-IrisV4-Thousand and Synthetic datasets, and observe that an alpha-wolf (from two wolves) can match up to 71 identities @FMR=0.001%, while an alpha-mammal (from two identities) can match up to 133 other identities @FMR=0.01% on the IITD dataset.

Submitted 20 November, 2023; originally announced March 2024.

Comments: 8 pages, 5 figures, 13 tables, Workshop on Manipulation, Adversarial, and Presentation Attacks in Biometrics, Winter Conference on Applications of Computer Vision

arXiv:2403.02610 [pdf, ps, other]

ChatGPT4PCG 2 Competition: Prompt Engineering for Science Birds Level Generation

Authors: Pittawat Taveekitworachai, Febri Abdullah, Mury F. Dewantoro, Yi Xia, Pratch Suntichaikul, Ruck Thawonmas, Julian Togelius, Jochen Renz

Abstract: This paper presents the second ChatGPT4PCG competition at the 2024 IEEE Conference on Games. In this edition of the competition, we follow the first edition, but make several improvements and changes. We introduce a new evaluation metric along with allowing a more flexible format for participants' submissions and making several improvements to the evaluation pipeline. Continuing from the first edition, we aim to foster and explore the realm of prompt engineering (PE) for procedural content generation (PCG). While the first competition saw success, it was hindered by various limitations; we aim to mitigate these limitations in this edition. We introduce diversity as a new metric to discourage submissions aimed at producing repetitive structures. Furthermore, we allow submission of a Python program instead of a prompt text file for greater flexibility in implementing advanced PE approaches, which may require control flow, including conditions and iterations. We also make several improvements to the evaluation pipeline with a better classifier for similarity evaluation and better-performing function signatures. We thoroughly evaluate the effectiveness of the new metric and the improved classifier. Additionally, we perform an ablation study to select a function signature to instruct ChatGPT for level generation. Finally, we provide implementation examples of various PE techniques in Python and evaluate their preliminary performance. We hope this competition serves as a resource and platform for learning about PE and PCG in general.

Submitted 4 March, 2024; originally announced March 2024.

ACM Class: I.2.7; I.2.8

arXiv:2403.02454 [pdf, other]

The Ink Splotch Effect: A Case Study on ChatGPT as a Co-Creative Game Designer

Authors: Asad Anjum, Yuting Li, Noelle Law, M Charity, Julian Togelius

Abstract: This paper studies how large language models (LLMs) can act as effective, high-level creative collaborators and "muses" for game design. We model the design of this study after the exercises artists use by looking at amorphous ink splotches for creative inspiration. Our goal is to determine whether AI assistance can improve, hinder, or provide an alternative quality to games when compared to the creative intents implemented by human designers. The capabilities of LLMs as game designers are stress tested by placing them at the forefront of the decision making process. Three prototype games are designed across 3 different genres: (1) a minimalist base game, (2) a game with features and game feel elements added by a human game designer, and (3) a game with features and feel elements directly implemented from prompted outputs of the LLM, ChatGPT. A user study was conducted and participants were asked to blindly evaluate the quality and their preference of these games. We discuss both the development process of communicating creative intent to an AI chatbot and the synthesized open feedback of the participants. We use this data to determine both the benefits and shortcomings of AI in a more design-centric role.

Submitted 4 March, 2024; originally announced March 2024.

Comments: 12 pages

arXiv:2402.18659 [pdf, other]

Large Language Models and Games: A Survey and Roadmap

Authors: Roberto Gallotta, Graham Todd, Marvin Zammit, Sam Earle, Antonios Liapis, Julian Togelius, Georgios N. Yannakakis

Abstract: Recent years have seen an explosive increase in research on large language models (LLMs), and accompanying public engagement on the topic. While starting as a niche area within natural language processing, LLMs have shown remarkable potential across a broad range of applications and domains, including games. This paper surveys the current state of the art across the various applications of LLMs in and for games, and identifies the different roles LLMs can take within a game. Importantly, we discuss underexplored areas and promising directions for future uses of LLMs in games and we reconcile the potential and limitations of LLMs within the games domain. As the first comprehensive survey and roadmap at the intersection of LLMs and games, we are hopeful that this paper will serve as the basis for groundbreaking research and innovation in this exciting new field.

Submitted 15 July, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

Comments: 18 pages, 6 figures

arXiv:2312.02231 [pdf, other]

Quality Diversity in the Amorphous Fortress (QD-AF): Evolving for Complexity in 0-Player Games

Authors: Sam Earle, M Charity, Dipika Rajesh, Mayu Wilson, Julian Togelius

Abstract: We explore the generation of diverse environments using the Amorphous Fortress (AF) simulation framework. AF defines a set of Finite State Machine (FSM) nodes and edges that can be recombined to control the behavior of agents in the 'fortress' grid-world. The behaviors and conditions of the agents within the framework are designed to capture the common building blocks of multi-agent artificial life and reinforcement learning environments. Using quality diversity evolutionary search, we generate diverse sets of environments. These environments exhibit certain types of complexity according to measures of agents' FSM architectures and activations, and collective behaviors. Our approach, Quality Diversity in Amorphous Fortress (QD-AF), generates families of 0-player games akin to simplistic ecological models, and we identify the emergence of both competitive and co-operative multi-agent and multi-species survival dynamics. We argue that these generated worlds can collectively serve as training and testing grounds for learning algorithms.

Submitted 4 December, 2023; originally announced December 2023.

Comments: 18 pages, 7 figures, ALOE workshop at NeurIPS 2023

arXiv:2311.16172 [pdf, other]

Evolutionary Machine Learning and Games

Authors: Julian Togelius, Ahmed Khalifa, Sam Earle, Michael Cerny Green, Lisa Soros

Abstract: Evolutionary machine learning (EML) has been applied to games in multiple ways, and for multiple different purposes. Importantly, AI research in games is not only about playing games; it is also about generating game content, modeling players, and many other applications. Many of these applications pose interesting problems for EML. We will structure this chapter on EML for games based on whether evolution is used to augment machine learning (ML) or ML is used to augment evolution. For completeness, we also briefly discuss the usage of ML and evolution separately in games.

Submitted 20 November, 2023; originally announced November 2023.

Comments: 27 pages, 5 figures, part of Evolutionary Machine Learning Book (https://link.springer.com/book/10.1007/978-981-99-3814-8)

arXiv:2311.03707 [pdf, other]

The NeurIPS 2022 Neural MMO Challenge: A Massively Multiagent Competition with Specialization and Trade

Authors: Enhong Liu, Joseph Suarez, Chenhui You, Bo Wu, Bingcheng Chen, Jun Hu, Jiaxin Chen, Xiaolong Zhu, Clare Zhu, Julian Togelius, Sharada Mohanty, Weijun Hong, Rui Du, Yibing Zhang, Qinwen Wang, Xinhang Li, Zheng Yuan, Xiang Li, Yuejia Huang, Kun Zhang, Hanhui Yang, Shiqi Tang, Phillip Isola

Abstract: In this paper, we present the results of the NeurIPS-2022 Neural MMO Challenge, which attracted 500 participants and received over 1,600 submissions. Like the previous IJCAI-2022 Neural MMO Challenge, it involved agents from 16 populations surviving in procedurally generated worlds by collecting resources and defeating opponents. This year's competition runs on the latest v1.6 Neural MMO, which introduces new equipment, combat, trading, and a better scoring system. These elements combine to pose additional robustness and generalization challenges not present in previous competitions. This paper summarizes the design and results of the challenge, explores the potential of this environment as a benchmark for learning methods, and presents some practical reinforcement learning training approaches for complex tasks with sparse rewards. Additionally, we have open-sourced our baselines, including environment wrappers, benchmarks, and visualization tools for future research.

Submitted 6 November, 2023; originally announced November 2023.

arXiv:2308.15802 [pdf, other]

Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO

Authors: Yangkun Chen, Joseph Suarez, Junjie Zhang, Chenghui Yu, Bo Wu, Hanmo Chen, Hengman Zhu, Rui Du, Shanliang Qian, Shuai Liu, Weijun Hong, Jinke He, Yibing Zhang, Liang Zhao, Clare Zhu, Julian Togelius, Sharada Mohanty, Jiaxin Chen, Xiu Li, Xiaolong Zhu, Phillip Isola

Abstract: We present the results of the second Neural MMO challenge, hosted at IJCAI 2022, which received 1600+ submissions. This competition targets robustness and generalization in multi-agent systems: participants train teams of agents to complete a multi-task objective against opponents not seen during training. The competition combines relatively complex environment design with large numbers of agents in the environment. The top submissions demonstrate strong success on this task using mostly standard reinforcement learning (RL) methods combined with domain-specific engineering. We summarize the competition design and results and suggest that, as an academic community, competitions may be a powerful approach to solving hard problems and establishing a solid benchmark for algorithms. We will open-source our benchmark including the environment wrapper, baselines, a visualization tool, and selected policies for further research.

Submitted 30 August, 2023; originally announced August 2023.

arXiv:2308.13538 [pdf, other]

A Preliminary Study on a Conceptual Game Feature Generation and Recommendation System

Authors: M Charity, Yash Bhartia, Daniel Zhang, Ahmed Khalifa, Julian Togelius

Abstract: This paper introduces a system used to generate game feature suggestions based on a text prompt. Trained on the game descriptions of almost 60k games, it uses the word embeddings of a small GloVe model to extract features and entities found in thematically similar games which are then passed through a generator model to generate new features for a user's prompt. We perform a short user study comparing the features generated from a fine-tuned GPT-2 model, a model using ConceptNet, and human-authored game features. Although human suggestions won the overall majority of votes, the GPT-2 model outperformed the human suggestions in certain games. This system is part of a larger game design assistant tool that is able to collaborate with users at a conceptual level.

Submitted 16 August, 2023; originally announced August 2023.

arXiv:2308.08638 [pdf, other]

Fair GANs through model rebalancing for extremely imbalanced class distributions

Authors: Anubhav Jain, Nasir Memon, Julian Togelius

Abstract: Deep generative models require large amounts of training data. This often poses a problem as the collection of datasets can be expensive and difficult, in particular datasets that are representative of the appropriate underlying distribution (e.g. demographic). This introduces biases in datasets which are further propagated in the models. We present an approach to construct an unbiased generative adversarial network (GAN) from an existing biased GAN by rebalancing the model distribution. We do so by generating balanced data from an existing imbalanced deep generative model using an evolutionary algorithm and then using this data to train a balanced generative model. Additionally, we propose a bias mitigation loss function that minimizes the deviation of the learned class distribution from being equiprobable. We show results for the StyleGAN2 models while training on the Flickr Faces High Quality (FFHQ) dataset for racial fairness and see that the proposed approach improves on the fairness metric by almost 5 times, whilst maintaining image quality. We further validate our approach by applying it to an imbalanced CIFAR10 dataset where we show that we can obtain comparable fairness and image quality as when training on a balanced CIFAR10 dataset which is also twice as large. Lastly, we argue that the traditionally used image quality metrics such as Frechet inception distance (FID) are unsuitable for scenarios where the class distributions are imbalanced and a balanced reference set is not available.

Submitted 21 December, 2023; v1 submitted 16 August, 2023; originally announced August 2023.

arXiv:2308.04052 [pdf, other]

The Five-Dollar Model: Generating Game Maps and Sprites from Sentence Embeddings

Authors: Timothy Merino, Roman Negri, Dipika Rajesh, M Charity, Julian Togelius

Abstract: The five-dollar model is a lightweight text-to-image generative architecture that generates low dimensional images from an encoded text prompt. This model can successfully generate accurate and aesthetically pleasing content in low dimensional domains, with limited amounts of training data. Despite the small size of both the model and datasets, the generated images are still able to maintain the encoded semantic meaning of the textual prompt. We apply this model to three small datasets: pixel art video game maps, video game sprite images, and down-scaled emoji images and apply novel augmentation strategies to improve the performance of our model on these limited datasets. We evaluate our model's performance using cosine similarity score between text-image pairs generated by the CLIP ViT-B/32 model.

Submitted 8 August, 2023; originally announced August 2023.

Comments: to be published in AIIDE 2023

arXiv:2308.01543 [pdf, other]

doi 10.1145/3582437.3587206

Lode Enhancer: Level Co-creation Through Scaling

Authors: Debosmita Bhaumik, Julian Togelius, Georgios N. Yannakakis, Ahmed Khalifa

Abstract: We explore AI-powered upscaling as a design assistance tool in the context of creating 2D game levels. Deep neural networks are used to upscale artificially downscaled patches of levels from the puzzle platformer game Lode Runner. The trained networks are incorporated into a web-based editor, where the user can create and edit levels at three different levels of resolution: 4x4, 8x8, and 16x16. An edit at any resolution instantly transfers to the other resolutions. As upscaling requires inventing features that might not be present at lower resolutions, we train neural networks to reproduce these features. We introduce a neural network architecture that is capable of not only learning upscaling but also giving higher priority to less frequent tiles. To investigate the potential of this tool and guide further development, we conduct a qualitative study with 3 designers to understand how they use it. Designers enjoyed co-designing with the tool, liked its underlying concept, and provided feedback for further improvement.

Submitted 3 August, 2023; originally announced August 2023.

arXiv:2308.01312 [pdf, other]

doi 10.1109/CoG52621.2021.9619009

Lode Encoder: AI-constrained co-creativity

Authors: Debosmita Bhaumik, Ahmed Khalifa, Julian Togelius

Abstract: We present Lode Encoder, a gamified mixed-initiative level creation system for the classic platform-puzzle game Lode Runner. The system is built around several autoencoders which are trained on sets of Lode Runner levels. When fed with the user's design, each autoencoder produces a version of that design which is closer in style to the levels that it was trained on. The Lode Encoder interface allows the user to build and edit levels through 'painting' from the suggestions provided by the autoencoders. Crucially, in order to encourage designers to explore new possibilities, the system does not include more traditional editing tools. We report on the system design and training procedure, as well as on the evolution of the system itself and user tests.

Submitted 2 August, 2023; originally announced August 2023.

Journal ref: 2021 IEEE Conference on Games (CoG), Copenhagen, Denmark, 2021, pp. 01-08

arXiv:2307.09777 [pdf, other]

Generating Redstone Style Cities in Minecraft

Authors: Shuo Huang, Chengpeng Hu, Julian Togelius, Jialin Liu

Abstract: Procedurally generating cities in Minecraft provides players more diverse scenarios and could help understand and improve the design of cities in other digital worlds and the real world. This paper presents a city generator that was submitted as an entry to the 2023 Edition of Minecraft Settlement Generation Competition for Minecraft. The generation procedure is composed of six main steps, namely vegetation clearing, terrain reshaping, building layout generation, route planning, streetlight placement, and wall construction. Three algorithms, including a heuristic-based algorithm, an evolving layout algorithm, and a random one are applied to generate the building layout, thus determining where to place different redstone style buildings, and tested by generating cities on random maps in limited time. Experimental results show that the heuristic-based algorithm is capable of finding an acceptable building layout faster for flat maps, while the evolving layout algorithm performs better in evolving layout for rugged maps. A user study is conducted to compare our generator with outstanding entries of the competition's 2022 edition using the competition's evaluation criteria and shows that our generator performs well in the adaptation and functionality criteria.

Submitted 19 July, 2023; originally announced July 2023.

arXiv:2306.13169 [pdf, other]

Amorphous Fortress: Observing Emergent Behavior in Multi-Agent FSMs

Authors: M Charity, Dipika Rajesh, Sam Earle, Julian Togelius

Abstract: We introduce a system called Amorphous Fortress -- an abstract, yet spatial, open-ended artificial life simulation. In this environment, the agents are represented as finite-state machines (FSMs) which allow for multi-agent interaction within a constrained space. These agents are created by randomly generating and evolving the FSMs; sampling from pre-defined states and transitions. This environment was designed to explore the emergent AI behaviors found implicitly in simulation games such as Dwarf Fortress or The Sims. We apply the hill-climber evolutionary search algorithm to this environment to explore the various levels of depth and interaction from the generated FSMs.

Submitted 22 June, 2023; originally announced June 2023.

Comments: 9 pages; Accepted to the 1st ALIFE for and from video games Workshop 2023

arXiv:2306.01102 [pdf, other]

doi 10.1145/3638529.3654017

LLMatic: Neural Architecture Search via Large Language Models and Quality Diversity Optimization

Authors: Muhammad U. Nasir, Sam Earle, Christopher Cleghorn, Steven James, Julian Togelius

Abstract: Large Language Models (LLMs) have emerged as powerful tools capable of accomplishing a broad spectrum of tasks. Their abilities span numerous areas, and one area where they have made a significant impact is in the domain of code generation. Here, we propose using the coding abilities of LLMs to introduce meaningful variations to code defining neural networks. Meanwhile, Quality-Diversity (QD) algorithms are known to discover diverse and robust solutions. By merging the code-generating abilities of LLMs with the diversity and robustness of QD solutions, we introduce LLMatic, a Neural Architecture Search (NAS) algorithm. While LLMs struggle to conduct NAS directly through prompts, LLMatic uses a procedural approach, leveraging QD for prompts and network architecture to create diverse and high-performing networks. We test LLMatic on the CIFAR-10 and NAS-bench-201 benchmarks, demonstrating that it can produce competitive networks while evaluating just 2,000 candidates, even without prior knowledge of the benchmark domain or exposure to any previous top-performing models for the benchmark. The open-sourced code is available at https://github.com/umair-nasir14/LLMatic.

Submitted 12 April, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

Comments: Accepted to The Genetic and Evolutionary Computation Conference 2024

arXiv:2305.18553 [pdf, other]

Controllable Path of Destruction

Authors: Matthew Siper, Sam Earle, Zehua Jiang, Ahmed Khalifa, Julian Togelius

Abstract: Path of Destruction (PoD) is a self-supervised method for learning iterative generators. The core idea is to produce a training set by destroying a set of artifacts, and for each destructive step create a training instance based on the corresponding repair action. A generator trained on this dataset can then generate new artifacts by repairing from arbitrary states. The PoD method is very data-efficient in terms of original training examples and well-suited to functional artifacts composed of categorical data, such as game levels and discrete 3D structures. In this paper, we extend the Path of Destruction method to allow designer control over aspects of the generated artifacts. Controllability is introduced by adding conditional inputs to the state-action pairs that make up the repair trajectories. We test the controllable PoD method in a 2D dungeon setting, as well as in the domain of small 3D Lego cars.

Submitted 31 May, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

Comments: 8 pages, 6 figures, and 2 tables. Published at CoG Conference 2023

arXiv:2305.18243 [pdf, other]

Practical PCG Through Large Language Models

Authors: Muhammad U Nasir, Julian Togelius

Abstract: Large Language Models (LLMs) have proven to be useful tools in various domains outside of the field of their inception, which was natural language processing. In this study, we provide practical directions on how to use LLMs to generate 2D-game rooms for an under-development game, named Metavoidal. Our technique can harness the power of GPT-3 by Human-in-the-loop fine-tuning, which allows our method to create 37% Playable-Novel levels from data as scarce as only 60 hand-designed rooms, in a game that is non-trivial with respect to Procedural Content Generation (PCG) and has a good amount of local and global constraints.

Submitted 2 July, 2023; v1 submitted 20 May, 2023; originally announced May 2023.

Comments: Published at 2023 IEEE Conference on Games

arXiv:2305.07710 [pdf, other]

Zero-shot racially balanced dataset generation using an existing biased StyleGAN2

Authors: Anubhav Jain, Nasir Memon, Julian Togelius

Abstract: Facial recognition systems have made significant strides thanks to data-heavy deep learning models, but these models rely on large privacy-sensitive datasets. Further, many of these datasets lack diversity in terms of ethnicity and demographics, which can lead to biased models that can have serious societal and security implications. To address these issues, we propose a methodology that leverages the biased generative model StyleGAN2 to create demographically diverse images of synthetic individuals. The synthetic dataset is created using a novel evolutionary search algorithm that targets specific demographic groups. By training face recognition models with the resulting balanced dataset containing 50,000 identities per race (13.5 million images in total), we can improve their performance and minimize biases that might have been present in a model trained on a real dataset.

Submitted 18 September, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

arXiv:2305.07392 [pdf, other]

The Ethics of AI in Games

Authors: David Melhart, Julian Togelius, Benedikte Mikkelsen, Christoffer Holmgård, Georgios N. Yannakakis

Abstract: Video games are one of the richest and most popular forms of human-computer interaction and, hence, their role is critical for our understanding of human behaviour and affect at a large scale. As artificial intelligence (AI) tools are gradually adopted by the game industry a series of ethical concerns arise. Such concerns, however, have so far not been extensively discussed in a video game context. Motivated by the lack of a comprehensive review of the ethics of AI as applied to games, we survey the current state of the art in this area and discuss ethical considerations of these systems from the holistic perspective of the affective loop. Through the components of this loop, we study the ethical challenges that AI faces in video game development. Elicitation highlights the ethical boundaries of artificially induced emotions; sensing showcases the trade-off between privacy and safe gaming spaces; and detection, as utilised during in-game adaptation, poses challenges to transparency and ownership. This paper calls for an open dialogue and action for the games of today and the virtual spaces of the future. By setting an appropriate framework we aim to protect users and to guide developers towards safer and better experiences for their customers.

Submitted 12 May, 2023; originally announced May 2023.

Comments: Version Accepted for the IEEE Transactions on Affective Computing Special Issue on Ethics in Affective Computing

arXiv:2304.06035 [pdf, other]

Choose Your Weapon: Survival Strategies for Depressed AI Academics

Authors: Julian Togelius, Georgios N. Yannakakis

Abstract: Are you an AI researcher at an academic institution? Are you anxious you are not coping with the current pace of AI advancements? Do you feel you have no (or very limited) access to the computational and human resources required for an AI research breakthrough? You are not alone; we feel the same way. A growing number of AI academics can no longer find the means and resources to compete at a global scale. This is a somewhat recent phenomenon, but an accelerating one, with private actors investing enormous compute resources into cutting edge AI research. Here, we discuss what you can do to stay competitive while remaining an academic. We also briefly discuss what universities and the private sector could do to improve the situation, if they are so inclined. This is not an exhaustive list of strategies, and you may not agree with all of them, but it serves to start a discussion.

Submitted 7 February, 2024; v1 submitted 31 March, 2023; originally announced April 2023.

Journal ref: Proceedings of the IEEE, 2024

arXiv:2303.15662 [pdf, other]

ChatGPT4PCG Competition: Character-like Level Generation for Science Birds

Authors: Pittawat Taveekitworachai, Febri Abdullah, Mury F. Dewantoro, Ruck Thawonmas, Julian Togelius, Jochen Renz

Abstract: This paper presents the first ChatGPT4PCG Competition at the 2023 IEEE Conference on Games. The objective of this competition is for participants to create effective prompts for ChatGPT--enabling it to generate Science Birds levels with high stability and character-like qualities--fully using their creativity as well as prompt engineering skills. ChatGPT is a conversational agent developed by OpenAI. Science Birds is selected as the competition platform because designing an Angry Birds-like level is not a trivial task due to the in-game gravity; the quality of the levels is determined by their stability. To lower the entry barrier to the competition, we limit the task to the generation of capitalized English alphabetical characters. We also allow only a single prompt to be used for generating all the characters. Here, the quality of the generated levels is determined by their stability and similarity to the given characters. A sample prompt is provided to participants for their reference. An experiment is conducted to determine the effectiveness of several modified versions of this sample prompt on level stability and similarity by testing them on several characters. To the best of our knowledge, we believe that ChatGPT4PCG is the first competition of its kind and hope to inspire enthusiasm for prompt engineering in procedural content generation.

Submitted 20 March, 2024; v1 submitted 27 March, 2023; originally announced March 2023.

Comments: This paper accepted for presentation at IEEE CoG 2023 is made available for participants of ChatGPT4PCG Competition (https://chatgpt4pcg.github.io/) and readers interested in relevant areas. In this PDF version, the affiliation symbol of Julian Togelius has been revised

ACM Class: I.2.7; I.2.8

arXiv:2302.05817 [pdf, other]

doi 10.1145/3582437.3587211

Level Generation Through Large Language Models

Authors: Graham Todd, Sam Earle, Muhammad Umair Nasir, Michael Cerny Green, Julian Togelius

Abstract: Large Language Models (LLMs) are powerful tools, capable of leveraging their training on natural language to write stories, generate code, and answer questions. But can they generate functional video game levels? Game levels, with their complex functional constraints and spatial relationships in more than one dimension, are very different from the kinds of data an LLM typically sees during training. Datasets of game levels are also hard to come by, potentially taxing the abilities of these data-hungry models. We investigate the use of LLMs to generate levels for the game Sokoban, finding that LLMs are indeed capable of doing so, and that their performance scales dramatically with dataset size. We also perform preliminary experiments on controlling LLM level generators and discuss promising areas for future work.

Submitted 1 June, 2023; v1 submitted 11 February, 2023; originally announced February 2023.

Journal ref: FDG 2023: Proceedings of the 18th International Conference on the Foundations of Digital Games

arXiv:2301.06820 [pdf, other]

Pathfinding Neural Cellular Automata

Authors: Sam Earle, Ozlem Yildiz, Julian Togelius, Chinmay Hegde

Abstract: Pathfinding makes up an important sub-component of a broad range of complex tasks in AI, such as robot path planning, transport routing, and game playing. While classical algorithms can efficiently compute shortest paths, neural networks could be better suited to adapting these sub-routines to more complex and intractable tasks. As a step toward developing such networks, we hand-code and learn models for Breadth-First Search (BFS), i.e. shortest path finding, using the unified architectural framework of Neural Cellular Automata, which are iterative neural networks with equal-size inputs and outputs. Similarly, we present a neural implementation of Depth-First Search (DFS), and outline how it can be combined with neural BFS to produce an NCA for computing diameter of a graph. We experiment with architectural modifications inspired by these hand-coded NCAs, training networks from scratch to solve the diameter problem on grid mazes while exhibiting strong generalization ability. Finally, we introduce a scheme in which data points are mutated adversarially during training. We find that adversarially evolving mazes leads to increased generalization on out-of-distribution examples, while at the same time generating data-sets with significantly more complex solutions for reasoning tasks.

Submitted 17 January, 2023; originally announced January 2023.

arXiv:2212.02571 [pdf, other]

A Dataless FaceSwap Detection Approach Using Synthetic Images

Authors: Anubhav Jain, Nasir Memon, Julian Togelius

Abstract: Face swapping technology used to create "Deepfakes" has advanced significantly over the past few years and now enables us to create realistic facial manipulations. Current deep learning algorithms to detect deepfakes have shown promising results, however, they require large amounts of training data, and as we show they are biased towards a particular ethnicity. We propose a deepfake detection methodology that eliminates the need for any real data by making use of synthetically generated data using StyleGAN3. This not only performs at par with the traditional training methodology of using real data but it shows better generalization capabilities when finetuned with a small amount of real data. Furthermore, this also reduces biases created by facial image datasets that might have sparse data from particular ethnicities.

Submitted 5 December, 2022; originally announced December 2022.

Comments: IJCB 2022

arXiv:2210.09294 [pdf, other]

Story Designer: Towards a Mixed-Initiative Tool to Create Narrative Structures

Authors: Alberto Alvarez, Jose Font, Julian Togelius

Abstract: Narratives are a predominant part of games, and their design poses challenges when identifying, encoding, interpreting, evaluating, and generating them. One way to address this would be to approach narrative design in a more abstract layer, such as narrative structures. This paper presents Story Designer, a mixed-initiative co-creative narrative structure tool built on top of the Evolutionary Dungeon Designer (EDD) that uses tropes, narrative conventions found across many media types, to design these structures. Story Designer uses tropes as building blocks for narrative designers to compose complete narrative structures by interconnecting them in graph structures called narrative graphs. Our mixed-initiative approach lets designers manually create their narrative graphs and feeds an underlying evolutionary algorithm with those, creating quality-diverse suggestions using MAP-Elites. Suggestions are visually represented for designers to compare and evaluate and can then be incorporated into the design for further manual editions. At the same time, we use the levels designed within EDD as constraints for the narrative structure, intertwining both level design and narrative. We evaluate the impact of these constraints and the system's adaptability and expressiveness, resulting in a potential tool to create narrative structures combining level design aspects with narrative.

Submitted 11 October, 2022; originally announced October 2022.

Comments: 9 pages, Accepted and to appear in Proceedings of the 17th International Conference on the Foundations of Digital Games (FDG), 2022

arXiv:2209.04911 [pdf, other]

Keke AI Competition: Solving puzzle levels in a dynamically changing mechanic space

Authors: M Charity, Julian Togelius

Abstract: The Keke AI Competition introduces an artificial agent competition for the game Baba is You - a Sokoban-like puzzle game where players can create rules that influence the mechanics of the game. Altering a rule can cause temporary or permanent effects for the rest of the level that could be part of the solution space. The nature of these dynamic rules and the deterministic aspect of the game create a challenge for AI to adapt to a variety of mechanic combinations in order to solve a level. This paper describes the framework and evaluation metrics used to rank submitted agents and baseline results from sample tree search agents.

Submitted 11 September, 2022; originally announced September 2022.

arXiv:2209.04909 [pdf, other]

Diversity and Novelty MasterPrints: Generating Multiple DeepMasterPrints for Increased User Coverage

Authors: M Charity, Nasir Memon, Zehua Jiang, Abhi Sen, Julian Togelius

Abstract: This work expands on previous advancements in genetic fingerprint spoofing via the DeepMasterPrints and introduces Diversity and Novelty MasterPrints. This system uses quality diversity evolutionary algorithms to generate dictionaries of artificial prints with a focus on increasing coverage of users from the dataset. The Diversity MasterPrints focus on generating solution prints that match with users not covered by previously found prints, and the Novelty MasterPrints explicitly search for prints that are farther in user space than previous prints. Our multi-print search methodologies outperform the singular DeepMasterPrints in both coverage and generalization while maintaining quality of the fingerprint image output.

Submitted 11 September, 2022; originally announced September 2022.

arXiv:2208.05017 [pdf, other]

Aesthetic Bot: Interactively Evolving Game Maps on Twitter

Authors: M Charity, Julian Togelius

Abstract: This paper describes the implementation of the Aesthetic Bot, an automated Twitter account that posts images of small game maps that are either user-made or generated from an evolutionary system. The bot then prompts users to vote via a poll posted in the image's thread for the most aesthetically pleasing map. This creates a rating system that allows for direct interaction with the bot in a way that is integrated seamlessly into a user's regularly updated Twitter content feed. Upon conclusion of each voting round, the bot learns from the distribution of votes for each map to emulate user preferences for design and visual aesthetic in order to generate maps that would win future vote pairings. We discuss the ongoing results and emerging behaviors that have occurred since the release of this system from both the bot's generation of game maps and the participating Twitter users.

Submitted 24 August, 2022; v1 submitted 9 August, 2022; originally announced August 2022.

arXiv:2206.13623 [pdf, other]

Learning Controllable 3D Level Generators

Authors: Zehua Jiang, Sam Earle, Michael Cerny Green, Julian Togelius

Abstract: Procedural Content Generation via Reinforcement Learning (PCGRL) foregoes the need for large human-authored data-sets and allows agents to train explicitly on functional constraints, using computable, user-defined measures of quality instead of target output. We explore the application of PCGRL to 3D domains, in which content-generation tasks naturally have greater complexity and potential pertinence to real-world applications. Here, we introduce several PCGRL tasks for the 3D domain, Minecraft (Mojang Studios, 2009). These tasks will challenge RL-based generators using affordances often found in 3D environments, such as jumping, multiple dimensional movement, and gravity. We train an agent to optimize each of these tasks to explore the capabilities of previous research in PCGRL. This agent is able to generate relatively complex and diverse levels, and generalize to random initial states and control targets. Controllability tests in the presented tasks demonstrate their utility to analyze success and failure for 3D generators.

Submitted 14 August, 2022; v1 submitted 27 June, 2022; originally announced June 2022.

Comments: 8 pages, 9 figures

arXiv:2206.10608 [pdf, other]

doi 10.1145/3532719.3543244

Generating Diverse Indoor Furniture Arrangements

Authors: Ya-Chuan Hsu, Matthew C. Fontaine, Sam Earle, Maria Edwards, Julian Togelius, Stefanos Nikolaidis

Abstract: We present a method for generating arrangements of indoor furniture from human-designed furniture layout data. Our method creates arrangements that target specified diversity, such as the total price of all furniture in the room and the number of pieces placed. To generate realistic furniture arrangements, we train a generative adversarial network (GAN) on human-designed layouts. To target specific diversity in the arrangements, we optimize the latent space of the GAN via a quality diversity algorithm to generate a diverse arrangement collection. Experiments show our approach discovers a set of arrangements that are similar to human-designed layouts but vary in price and number of furniture pieces.

Submitted 20 June, 2022; originally announced June 2022.

arXiv:2206.05497 [pdf, other]

Mutation Models: Learning to Generate Levels by Imitating Evolution

Authors: Ahmed Khalifa, Michael Cerny Green, Julian Togelius

Abstract: Search-based procedural content generation (PCG) is a well-known method for level generation in games. Its key advantage is that it is generic and able to satisfy functional constraints. However, due to the heavy computational costs to run these algorithms online, search-based PCG is rarely utilized for real-time generation. In this paper, we introduce mutation models, a new type of iterative level generator based on machine learning. We train a model to imitate the evolutionary process and use the trained model to generate levels. This trained model is able to modify noisy levels sequentially to create better levels without the need for a fitness function during inference. We evaluate our trained models on a 2D maze generation task. We compare several different versions of the method: training the models either at the end of evolution (normal evolution) or every 100 generations (assisted evolution) and using the model as a mutation function during evolution. Using the assisted evolution process, the final trained models are able to generate mazes with a success rate of 99% and high diversity of 86%. The trained model is many times faster than the evolutionary process it was trained on. This work opens the door to a new way of learning level generators guided by an evolutionary process, meaning automatic creation of generators with specifiable constraints and objectives that are fast enough for runtime deployment in games.

Submitted 25 August, 2022; v1 submitted 11 June, 2022; originally announced June 2022.

Comments: 8 pages, 6 figures, and 2 tables. Published at PCGWorkshop 2022 at FDG 2022

arXiv:2206.00089 [pdf, other]

Defining Quantum Games

Authors: Laura Piispanen, Marcel Pfaffhauser, James Wootton, Julian Togelius, Annakaisa Kultima

Abstract: In this article, we survey the existing quantum physics related games and based on them propose a definition for the concept of quantum games. We define quantum games as any type of rule-based games that use the principles or reference the theory of quantum physics or quantum phenomena through any of three proposed dimensions: the perceivable dimension of quantum physics, the dimension of quantum technologies, and the dimension of scientific purposes like citizen science or education. We also discuss the concept of quantum computer games, games on quantum computers and discuss the definitions for the concept of science games. At the same time, there are various games exploring quantum physics and quantum computing through digital, analogue, and hybrid means with diverse incentives driving their development. As interest in games as educational tools for supporting quantum literacy grows, understanding the diverse landscape of quantum games becomes increasingly important. We propose that three dimensions of quantum games identified in this article are used for designing, analysing and defining the phenomenon of quantum games.

Submitted 11 April, 2024; v1 submitted 31 May, 2022; originally announced June 2022.

Comments: 21 pages + references, 24 pictures in 6 figures, 3 tables

arXiv:2204.13250 [pdf, other]

Watts: Infrastructure for Open-Ended Learning

Authors: Aaron Dharna, Charlie Summers, Rohin Dasari, Julian Togelius, Amy K. Hoover

Abstract: This paper proposes a framework called Watts for implementing, comparing, and recombining open-ended learning (OEL) algorithms. Motivated by modularity and algorithmic flexibility, Watts atomizes the components of OEL systems to promote the study of and direct comparisons between approaches. Examining implementations of three OEL algorithms, the paper introduces the modules of the framework. The hope is for Watts to enable benchmarking and to explore new types of OEL algorithms. The repo is available at https://github.com/aadharna/watts

Submitted 27 April, 2022; originally announced April 2022.

Comments: ICLR Workshop on Agent Learning in Open-Endedness (ALOE 2022)

arXiv:2204.05217 [pdf, other]

Persona-driven Dominant/Submissive Map (PDSM) Generation for Tutorials

Authors: Michael Cerny Green, Ahmed Khalifa, M Charity, Julian Togelius

Abstract: In this paper, we present a method for automated persona-driven video game tutorial level generation. Tutorial levels are scenarios in which the player can explore and discover different rules and game mechanics. Procedural personas can guide generators to create content which encourages or discourages certain playstyle behaviors. In this system, we use procedural personas to calculate the behavioral characteristics of levels which are evolved using the quality-diversity algorithm known as Constrained MAP-Elites. An evolved map's quality is determined by its simplicity: the simpler it is, the better it is. Within this work, we show that the generated maps can strongly encourage or discourage different persona-like behaviors and range from simple solutions to complex puzzle-levels, making them perfect candidates for a tutorial generative system.

Submitted 11 April, 2022; originally announced April 2022.

Comments: 10 pages, 7 figures, 2 tables

arXiv:2203.13351 [pdf, other]

Predicting Personas Using Mechanic Frequencies and Game State Traces

Authors: Michael Cerny Green, Ahmed Khalifa, M Charity, Debosmita Bhaumik, Julian Togelius

Abstract: We investigate how to efficiently predict play personas based on playtraces. Play personas can be computed by calculating the action agreement ratio between a player and a generative model of playing behavior, a so-called procedural persona. But this is computationally expensive and assumes that appropriate procedural personas are readily available. We present two methods for estimating player persona, one using regular supervised learning and aggregate measures of game mechanics initiated, and another based on sequence learning on a trace of closely cropped gameplay observations. While both of these methods achieve high accuracy when predicting play personas defined by agreement with procedural personas, they utterly fail to predict play style as defined by the players themselves using a questionnaire. This interesting result highlights the value of using computational methods in defining play personas.

Submitted 15 June, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

Comments: 8 pages, 3 tables, 2 figures

arXiv:2203.10941 [pdf, other]

Transfer Dynamics in Emergent Evolutionary Curricula

Authors: Aaron Dharna, Amy K Hoover, Julian Togelius, L. B. Soros

Abstract: PINSKY is a system for open-ended learning through neuroevolution in game-based domains. It builds on the Paired Open-Ended Trailblazer (POET) system, which originally explored learning and environment generation for bipedal walkers, and adapts it to games in the General Video Game AI (GVGAI) system. Previous work showed that by co-evolving levels and neural network policies, levels could be found for which successful policies could not be created via optimization alone. Studied in the realm of Artificial Life as a potentially open-ended alternative to gradient-based fitness, minimal criteria (MC)-based selection helps foster diversity in evolutionary populations. The main question addressed by this paper is how the open-ended learning actually works, focusing in particular on the role of transfer of policies from one evolutionary branch ("species") to another. We analyze the dynamics of the system through creating phylogenetic trees, analyzing evolutionary trajectories of policies, and temporally breaking down transfers according to species type. Furthermore, we analyze the impact of the minimal criterion on generated level diversity and inter-species transfer. The most insightful finding is that inter-species transfer, while rare, is crucial to the system's success.

Submitted 3 March, 2022; originally announced March 2022.

Comments: 14 pages, 9 figures, published in IEEE Transactions on Games

arXiv:2203.02035 [pdf, other]

Baba is Y'all 2.0: Design and Investigation of a Collaborative Mixed-Initiative System

Authors: M Charity, Isha Dave, Ahmed Khalifa, Julian Togelius

Abstract: This paper describes a new version of the mixed-initiative collaborative level designing system: Baba is Y'all, as well as the results of a user study on the system. Baba is Y'all is a prototype for AI-assisted game design in collaboration with others. The updated version includes a more user-friendly interface, a better level-evolver and recommendation system, and extended site features. The system was evaluated via a user study where participants were required to play a previously submitted level from the site and then create their own levels using the editor. They reported on their individual process creating the level and their overall experience interacting with the site. The results have shown both the benefits and limitations of this mixed-initiative system and how it can help with creating a diversity of 'Baba is You' levels that are both human and AI designed while maintaining their quality.

Submitted 10 October, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

Comments: 15 pages

arXiv:2202.10184 [pdf, other]

Path of Destruction: Learning an Iterative Level Generator Using a Small Dataset

Authors: Matthew Siper, Ahmed Khalifa, Julian Togelius

Abstract: We propose a new procedural content generation method which learns iterative level generators from a dataset of existing levels. The Path of Destruction method, as we call it, views level generation as repair; levels are created by iteratively repairing from a random starting level. The first step is to generate an artificial dataset from the original set of levels by introducing many different sequences of mutations to existing levels. In the generated dataset, features are observations of destroyed levels and targets are the specific actions that repair the mutated tile in the middle of the observations. Using this dataset, a convolutional network is trained to map from observations to their respective appropriate repair actions. The trained network is then used to iteratively produce levels from random starting maps. We demonstrate this method by applying it to generate unique and playable tile-based levels for several 2D games (Zelda, Danger Dave, and Sokoban) and vary key hyperparameters.

Submitted 3 October, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

Comments: 7 pages, 7 figures, and 3 tables. Published at SSCI Conference 2022

arXiv:2202.03666 [pdf, other]

Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning

Authors: Bryon Tjanaka, Matthew C. Fontaine, Julian Togelius, Stefanos Nikolaidis

Abstract: Consider the problem of training robustly capable agents. One approach is to generate a diverse collection of agent policies. Training can then be viewed as a quality diversity (QD) optimization problem, where we search for a collection of performant policies that are diverse with respect to quantified behavior. Recent work shows that differentiable quality diversity (DQD) algorithms greatly accelerate QD optimization when exact gradients are available. However, agent policies typically assume that the environment is not differentiable. To apply DQD algorithms to training agent policies, we must approximate gradients for performance and behavior. We propose two variants of the current state-of-the-art DQD algorithm that compute gradients via approximation methods common in reinforcement learning (RL). We evaluate our approach on four simulated locomotion tasks. One variant achieves results comparable to the current state-of-the-art in combining QD and RL, while the other performs comparably in two locomotion tasks. These results provide insight into the limitations of current DQD algorithms in domains where gradients must be approximated. Source code is available at https://github.com/icaros-usc/dqd-rl

Submitted 15 April, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

Comments: Published as a conference paper at the 2022 Genetic and Evolutionary Computation Conference (GECCO '22); Online article available at http://dqd-rl.github.io

arXiv:2109.05489 [pdf, other]

Illuminating Diverse Neural Cellular Automata for Level Generation

Authors: Sam Earle, Justin Snider, Matthew C. Fontaine, Stefanos Nikolaidis, Julian Togelius

Abstract: We present a method of generating diverse collections of neural cellular automata (NCA) to design video game levels. While NCAs have so far only been trained via supervised learning, we present a quality diversity (QD) approach to generating a collection of NCA level generators. By framing the problem as a QD problem, our approach can train diverse level generators, whose output levels vary based on aesthetic or functional criteria. To efficiently generate NCAs, we train generators via Covariance Matrix Adaptation MAP-Elites (CMA-ME), a quality diversity algorithm which specializes in continuous search spaces. We apply our new method to generate level generators for several 2D tile-based games: a maze game, Sokoban, and Zelda. Our results show that CMA-ME can generate small NCAs that are diverse yet capable, often satisfying complex solvability criteria for deterministic agents. We compare against a Compositional Pattern-Producing Network (CPPN) baseline trained to produce diverse collections of generators and show that the NCA representation yields a better exploration of level-space.

Submitted 17 February, 2022; v1 submitted 12 September, 2021; originally announced September 2021.

Comments: 9 pages, 7 figures

arXiv:2108.02955 [pdf, other]

Impressions of the GDMC AI Settlement Generation Challenge in Minecraft

Authors: Christoph Salge, Claus Aranha, Adrian Brightmoore, Sean Butler, Rodrigo Canaan, Michael Cook, Michael Cerny Green, Hagen Fischer, Christian Guckelsberger, Jupiter Hadley, Jean-Baptiste Hervé, Mark R Johnson, Quinn Kybartas, David Mason, Mike Preuss, Tristan Smith, Ruck Thawonmas, Julian Togelius

Abstract: The GDMC AI settlement generation challenge is a PCG competition about producing an algorithm that can create an "interesting" Minecraft settlement for a given map. This paper contains a collection of written experiences with this competition, by participants, judges, organizers and advisors. We asked people to reflect both on the artifacts themselves, and on the competition in general. The aim of this paper is to offer a shareable and edited collection of experiences and qualitative feedback - which seem to contain a lot of insights on PCG and computational creativity, but would otherwise be lost once the output of the competition is reduced to scalar performance values. We reflect upon some organizational issues for AI competitions, and discuss the future of the GDMC competition.

Submitted 6 August, 2021; originally announced August 2021.

Comments: 28 pages, 5 figures

arXiv:2107.04964 [pdf, other]

Self-Referential Quality Diversity Through Differential Map-Elites

Authors: Tae Jong Choi, Julian Togelius

Abstract: Differential MAP-Elites is a novel algorithm that combines the illumination capacity of CVT-MAP-Elites with the continuous-space optimization capacity of Differential Evolution. The algorithm is motivated by observations that illumination algorithms, and quality-diversity algorithms in general, offer qualitatively new capabilities and applications for evolutionary computation yet are in their original versions relatively unsophisticated optimizers. The basic Differential MAP-Elites algorithm, introduced for the first time here, is relatively simple in that it simply combines the operators from Differential Evolution with the map structure of CVT-MAP-Elites. Experiments based on 25 numerical optimization problems suggest that Differential MAP-Elites clearly outperforms CVT-MAP-Elites, finding better-quality and more diverse solutions.

Submitted 11 July, 2021; originally announced July 2021.
