-
Authorship Obfuscation in Multilingual Machine-Generated Text Detection
Authors:
Dominik Macko,
Robert Moro,
Adaku Uchendu,
Ivan Srba,
Jason Samuel Lucas,
Michiharu Yamashita,
Nafis Irtiza Tripto,
Dongwon Lee,
Jakub Simko,
Maria Bielikova
Abstract:
High-quality text generation capability of recent Large Language Models (LLMs) causes concerns about their misuse (e.g., in massive generation/spread of disinformation). Machine-generated text (MGT) detection is important to cope with such threats. However, it is susceptible to authorship obfuscation (AO) methods, such as paraphrasing, which can cause MGTs to evade detection. So far, this was evaluated only in monolingual settings. Thus, the susceptibility of recently proposed multilingual detectors is still unknown. We fill this gap by comprehensively benchmarking the performance of 10 well-known AO methods, attacking 37 MGT detection methods against MGTs in 11 languages (i.e., 10 $\times$ 37 $\times$ 11 = 4,070 combinations). We also evaluate the effect of data augmentation on adversarial robustness using obfuscated texts. The results indicate that all tested AO methods can cause evasion of automated detection in all tested languages, where homoglyph attacks are especially successful. However, some of the AO methods severely damaged the text, making it no longer readable or easily recognizable by humans (e.g., changed language, weird characters).
Submitted 18 June, 2024; v1 submitted 15 January, 2024;
originally announced January 2024.
-
ADAPTER-RL: Adaptation of Any Agent using Reinforcement Learning
Authors:
Yizhao Jin,
Greg Slabaugh,
Simon Lucas
Abstract:
Deep Reinforcement Learning (DRL) agents frequently face challenges in adapting to tasks outside their training distribution, including issues with over-fitting, catastrophic forgetting and sample inefficiency. Although the application of adapters has proven effective in supervised learning contexts such as natural language processing and computer vision, their potential within the DRL domain remains largely unexplored. This paper delves into the integration of adapters in reinforcement learning, presenting an innovative adaptation strategy that demonstrates enhanced training efficiency and improvement of the base-agent, experimentally in the nanoRTS environment, a real-time strategy (RTS) game simulation. Our proposed universal approach is not only compatible with pre-trained neural networks but also with rule-based agents, offering a means to integrate human expertise.
Submitted 19 November, 2023;
originally announced November 2023.
-
MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark
Authors:
Dominik Macko,
Robert Moro,
Adaku Uchendu,
Jason Samuel Lucas,
Michiharu Yamashita,
Matúš Pikuliak,
Ivan Srba,
Thai Le,
Dongwon Lee,
Jakub Simko,
Maria Bielikova
Abstract:
There is a lack of research into capabilities of recent LLMs to generate convincing text in languages other than English and into performance of detectors of machine-generated text in multilingual settings. This is also reflected in the available benchmarks which lack authentic texts in languages other than English and predominantly cover older generators. To fill this gap, we introduce MULTITuDE, a novel benchmarking dataset for multilingual machine-generated text detection comprising of 74,081 authentic and machine-generated texts in 11 languages (ar, ca, cs, de, en, es, nl, pt, ru, uk, and zh) generated by 8 multilingual LLMs. Using this benchmark, we compare the performance of zero-shot (statistical and black-box) and fine-tuned detectors. Considering the multilinguality, we evaluate 1) how these detectors generalize to unseen languages (linguistically similar as well as dissimilar) and unseen LLMs and 2) whether the detectors improve their performance when trained on multiple languages.
Submitted 20 October, 2023;
originally announced October 2023.
-
Proving Confluence in the Confluence Framework with CONFident
Authors:
Raúl Gutiérrez,
Salvador Lucas,
Miguel Vítores
Abstract:
This article describes the *Confluence Framework*, a novel framework for proving and disproving confluence using a divide-and-conquer modular strategy, and its implementation in CONFident. Using this approach, we are able to automatically prove and disprove confluence of *Generalized Term Rewriting Systems*, where (i) only selected arguments of function symbols can be rewritten and (ii) a rather general class of conditional rules can be used. This includes, as particular cases, several variants of rewrite systems such as (context-sensitive) *term rewriting systems*, *string rewriting systems*, and (context-sensitive) *conditional term rewriting systems*. The divide-and-conquer modular strategy allows us to combine in a proof tree different techniques for proving confluence, including modular decompositions, checking joinability of (conditional) critical and variable pairs, transformations, etc., and auxiliary tasks required by them, e.g., joinability of terms, joinability of conditional pairs, etc.
Submitted 8 May, 2024; v1 submitted 28 June, 2023;
originally announced June 2023.
-
Partial advantage estimator for proximal policy optimization
Authors:
Xiulei Song,
Yizhao Jin,
Greg Slabaugh,
Simon Lucas
Abstract:
Estimation of value in policy gradient methods is a fundamental problem. Generalized Advantage Estimation (GAE) is an exponentially-weighted estimator of an advantage function similar to $λ$-return. It substantially reduces the variance of policy gradient estimates at the expense of bias. In practical applications, a truncated GAE is used due to the incompleteness of the trajectory, which results in a large bias during estimation. To address this challenge, instead of using the entire truncated GAE, we propose to take a part of it when calculating updates, which significantly reduces the bias resulting from the incomplete trajectory. We perform experiments in MuJoCo and $μ$RTS to investigate the effect of different partial coefficient and sampling lengths. We show that our partial GAE approach yields better empirical results in both environments.
Submitted 25 January, 2023;
originally announced January 2023.
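The abstract above describes computing a truncated GAE over an incomplete trajectory and then using only part of it for updates. A minimal sketch of that idea follows. The function names, the `keep` parameter, and the choice to keep the earliest estimates are illustrative assumptions, not the authors' implementation; the recursion itself is the standard GAE form.

```python
from typing import List


def truncated_gae(rewards: List[float], values: List[float],
                  gamma: float = 0.99, lam: float = 0.95) -> List[float]:
    """Standard truncated GAE over a trajectory of length T.

    `values` must hold T + 1 entries; the last one is the bootstrap
    value of the state reached after the final reward.
    """
    T = len(rewards)
    advantages = [0.0] * T
    running = 0.0
    # Backward recursion: A_t = delta_t + gamma * lam * A_{t+1}
    for t in reversed(range(T)):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def partial_gae(rewards: List[float], values: List[float], keep: int,
                gamma: float = 0.99, lam: float = 0.95) -> List[float]:
    """Hypothetical 'partial' variant: compute the truncated GAE over the
    whole sampled segment but keep only the first `keep` estimates.

    Assumption: the earliest estimates are kept because they see the most
    future steps and so suffer least from truncation.
    """
    return truncated_gae(rewards, values, gamma, lam)[:keep]
```

With `gamma = lam = 1` and zero value estimates, the advantage at each step reduces to the sum of remaining rewards, which makes the truncation effect easy to inspect by hand.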
-
Joint action loss for proximal policy optimization
Authors:
Xiulei Song,
Yizhao Jin,
Greg Slabaugh,
Simon Lucas
Abstract:
PPO (Proximal Policy Optimization) is a state-of-the-art policy gradient algorithm that has been successfully applied to complex computer games such as Dota 2 and Honor of Kings. In these environments, an agent makes compound actions consisting of multiple sub-actions. PPO uses clipping to restrict policy updates. Although clipping is simple and effective, it is not efficient in its sample use. For compound actions, most PPO implementations consider the joint probability (density) of sub-actions, which means that if the ratio of a sample (state compound-action pair) exceeds the range, the gradient the sample produces is zero. Instead, for each sub-action we calculate the loss separately, which is less prone to clipping during updates thereby making better use of samples. Further, we propose a multi-action mixed loss that combines joint and separate probabilities. We perform experiments in Gym-$μ$RTS and MuJoCo. Our hybrid model improves performance by more than 50% in different MuJoCo environments compared to OpenAI's PPO benchmark results. And in Gym-$μ$RTS, we find the sub-action loss outperforms the standard PPO approach, especially when the clip range is large. Our findings suggest this method can better balance the use-efficiency and quality of samples.
Submitted 25 January, 2023;
originally announced January 2023.
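The contrast the abstract above draws between a joint ratio and per-sub-action ratios can be illustrated with the standard PPO clipped surrogate. This is a sketch under stated assumptions: the function names are hypothetical, averaging the per-sub-action terms is one plausible way to combine them, and the mixing weight `alpha` is an illustrative parameter, not the paper's formulation.

```python
import math
from typing import Sequence


def clipped_surrogate(ratio: float, advantage: float, eps: float) -> float:
    """Standard PPO clipped objective for a single probability ratio."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


def joint_objective(sub_ratios: Sequence[float], advantage: float,
                    eps: float) -> float:
    """Joint treatment: the compound-action ratio is the product of the
    sub-action ratios, so small per-sub-action deviations compound and
    the sample is clipped sooner."""
    return clipped_surrogate(math.prod(sub_ratios), advantage, eps)


def separate_objective(sub_ratios: Sequence[float], advantage: float,
                       eps: float) -> float:
    """Separate treatment: clip each sub-action ratio on its own and
    average the results (averaging is an assumption of this sketch)."""
    terms = [clipped_surrogate(r, advantage, eps) for r in sub_ratios]
    return sum(terms) / len(terms)


def mixed_objective(sub_ratios: Sequence[float], advantage: float,
                    eps: float, alpha: float = 0.5) -> float:
    """Hypothetical mixed loss: a convex combination of the two terms."""
    return (alpha * joint_objective(sub_ratios, advantage, eps)
            + (1.0 - alpha) * separate_objective(sub_ratios, advantage, eps))
```

For three sub-actions each with ratio 1.1 and a clip range of 0.2, the joint ratio is about 1.331 and is clipped, while every individual ratio stays inside the range and still contributes a gradient.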
-
Airborne absolute gravimetry with a quantum sensor, comparison with classical technologies
Authors:
Yannick Bidel,
Nassim Zahzam,
Alexandre Bresson,
Cédric Blanchard,
Alexis Bonnin,
Jeanne Bernard,
Malo Cadoret,
Tim Enzlberger Jensen,
René Forsberg,
Corinne Salaun,
Sylvain Lucas,
Marie Francoise Lequentrec-Lalancette,
Didier Rouxel,
Germinal Gabalda,
Lucia Seoane,
Dinh Toan Vu,
Sylvain Bonvalot
Abstract:
We report an airborne gravity survey with an absolute gravimeter based on atom interferometry and two relative gravimeters: a classical LaCoste&Romberg (L&R) and a novel iMAR strap-down Inertial Measurement Unit (IMU). We estimated measurement errors for the quantum gravimeter ranging from 0.6 to 1.3 mGal depending on the flight conditions and the filtering used. Similar measurement errors are obtained with iMAR strapdown gravimeter but the long term stability is five times worse. The traditional L&R platform gravimeter shows larger measurement errors (3 - 4 mGal). Airborne measurements have been compared to marine, land and altimetry derived gravity data. We obtain a good agreement for the quantum gravimeter with standard deviations and means on differences below or equal to 2 mGal. This study confirms the potential of quantum technology for absolute airborne gravimetry which is particularly interesting for mapping shallow water or mountainous areas and for linking ground and satellite measurements with homogeneous absolute referencing.
Submitted 14 October, 2022;
originally announced October 2022.
-
Radio-Frequency Sweeps at μT Fields for Parahydrogen-Induced Polarization of Biomolecules
Authors:
Alastair Marshall,
Alon Salhov,
Martin Gierse,
Christoph Müller,
Michael Keim,
Sebastian Lucas,
Anna Parker,
Jochen Scheuer,
Christophoros Vassiliou,
Philipp Neumann,
Fedor Jelezko,
Alex Retzker,
John W. Blanchard,
Ilai Schwartz,
Stephan Knecht
Abstract:
Magnetic resonance imaging of $^{13}$C-labeled metabolites enhanced by parahydrogen-induced polarization (PHIP) can enable real-time monitoring of processes within the body. We introduce a robust, easily implementable technique for transferring parahydrogen-derived singlet order into 13C magnetization using adiabatic radio-frequency sweeps at $μ$T fields. We experimentally demonstrate the applicability of this technique to several molecules, including some molecules relevant for metabolic imaging, where we show significant improvements in the achievable polarization, in some cases reaching above 60%. Furthermore, we introduce a site-selective deuteration scheme, where deuterium is included in the coupling network of a pyruvate ester to enhance the efficiency of the polarization transfer. These improvements are enabled by the fact that the transfer protocol avoids relaxation induced by strongly coupled quadrupolar nuclei.
Submitted 31 May, 2022;
originally announced May 2022.
-
Visualising Multiplayer Game Spaces
Authors:
James Goodman,
Diego Perez-Liebana,
Simon Lucas
Abstract:
We compare four different `game-spaces' in terms of their usefulness in characterising multi-player tabletop games, with a particular interest in any underlying change to a game's characteristics as the number of players changes. In each case we take a 16-dimensional feature space, and reduce it to a 2-dimensional visualizable landscape.
We find that a space obtained from optimization of parameters in Monte Carlo Tree Search (MCTS) is the most directly interpretable to characterise our set of games in terms of the relative importance of imperfect information, adversarial opponents and reward sparsity. These results do not correlate with a space defined using attributes of the game-tree.
This dimensionality reduction does not show any general effect as the number of players changes. We therefore consider the question using the original features to classify the games into two sets: those for which the characteristics of the game change significantly as the number of players changes, and those for which there is no such effect.
Submitted 11 February, 2022;
originally announced February 2022.
-
Hyperpolarized solution-state NMR spectroscopy with optically polarized crystals
Authors:
Tim R. Eichhorn,
Anna J. Parker,
Felix Josten,
Christoph Müller,
Jochen Scheuer,
Jakob M. Steiner,
Martin Gierse,
Jonas Handwerker,
Michael Keim,
Sebastian Lucas,
Mohammad Usman Qureshi,
Alastair Marshall,
Alon Salhov,
Yifan Quan,
Jan Binder,
Kay Jahnke,
Philipp Neumann,
Stephan Knecht,
John W. Blanchard,
Martin B. Plenio,
Fedor Jelezko,
Lyndon Emsley,
Christophoros C. Vassiliou,
Patrick Hautle,
Ilai Schwartz
Abstract:
Nuclear spin hyperpolarization provides a promising route to overcome the challenges imposed by the limited sensitivity of nuclear magnetic resonance. Here we demonstrate that dissolution of spin-polarized pentacene-doped naphthalene crystals enables transfer of polarization to target molecules via intermolecular cross relaxation at room temperature and moderate magnetic fields (1.45$\,$T). This makes it possible to exploit the high spin polarization of optically polarized crystals while mitigating the challenges of its transfer to external nuclei, particularly of the large distances and prohibitively weak coupling between source and target nuclei across solid-solid or solid-liquid interfaces. With this method, here we inject the highly polarized mixture into a benchtop NMR spectrometer and observe the polarization dynamics for target $^1$H nuclei. Although the spectra are radiation damped due to the high naphthalene magnetization, we describe a procedure to process the data in order to obtain more conventional NMR spectra, and extract the target nuclei polarization. With the entire process occurring on a timescale of one minute, we observe NMR signals enhanced by factors between -200 and -1730 at 1.45$\,$T for a range of small molecules.
Submitted 17 August, 2021; v1 submitted 13 August, 2021;
originally announced August 2021.
-
Predictive Control Using Learned State Space Models via Rolling Horizon Evolution
Authors:
Alvaro Ovalle,
Simon M. Lucas
Abstract:
A large part of the interest in model-based reinforcement learning derives from the potential utility to acquire a forward model capable of strategic long term decision making. Assuming that an agent succeeds in learning a useful predictive model, it still requires a mechanism to harness it to generate and select among competing simulated plans. In this paper, we explore this theme combining evolutionary algorithmic planning techniques with models learned via deep learning and variational inference. We demonstrate the approach with an agent that reliably performs online planning in a set of visual navigation tasks.
Submitted 25 June, 2021;
originally announced June 2021.
-
Rinascimento: searching the behaviour space of Splendor
Authors:
Ivan Bravi,
Simon Lucas
Abstract:
The use of Artificial Intelligence (AI) for play-testing is still on the sidelines of main applications of AI in games compared to performance-oriented game-playing. One of the main purposes of play-testing a game is gathering data on the gameplay, highlighting good and bad features of the design of the game, providing useful insight to the game designers for improving the design. Using AI agents has the potential of speeding the process dramatically. The purpose of this research is to map the behavioural space (BSpace) of a game by using a general method. Using the MAP-Elites algorithm we search the hyperparameter space of Rinascimento AI agents and map it to the BSpace defined by several behavioural metrics. This methodology was able to highlight both exemplary and degenerated behaviours in the original game design of Splendor and two variations. In particular, the use of event-value functions has generally shown a remarkable improvement in the coverage of the BSpace compared to agents based on classic score-based reward signals.
Submitted 15 June, 2021;
originally announced June 2021.
-
Griddly: A platform for AI research in games
Authors:
Chris Bamford,
Shengyi Huang,
Simon Lucas
Abstract:
In recent years, there have been immense breakthroughs in Game AI research, particularly with Reinforcement Learning (RL). Despite their success, the underlying games are usually implemented with their own preset environments and game mechanics, thus making it difficult for researchers to prototype different game environments. However, testing the RL agents against a variety of game environments is critical for recent effort to study generalization in RL and avoid the problem of overfitting that may otherwise occur. In this paper, we present Griddly as a new platform for Game AI research that provides a unique combination of highly configurable games, different observer types and an efficient C++ core engine. Additionally, we present a series of baseline experiments to study the effect of different observation configurations and generalization ability of RL agents.
Submitted 12 July, 2022; v1 submitted 12 November, 2020;
originally announced November 2020.
-
AI and Wargaming
Authors:
James Goodman,
Sebastian Risi,
Simon Lucas
Abstract:
Recent progress in Game AI has demonstrated that given enough data from human gameplay, or experience gained via simulations, machines can rival or surpass the most skilled human players in classic games such as Go, or commercial computer games such as Starcraft. We review the current state-of-the-art through the lens of wargaming, and ask firstly what features of wargames distinguish them from the usual AI testbeds, and secondly which recent AI advances are best suited to address these wargame-specific features.
Submitted 25 September, 2020; v1 submitted 18 September, 2020;
originally announced September 2020.
-
Cross-Platform Games in Kotlin
Authors:
Simon M. Lucas
Abstract:
This demo paper describes a simple and practical approach to writing cross-platform casual games using the Kotlin programming language. A key aim is to make it much easier for researchers to demonstrate their AI playing a range of games. Pure Kotlin code (which excludes using any Java graphics libraries) can be transpiled to JavaScript and run in a web browser. However, writing Kotlin code that will run without modification both in a web browser and on the JVM is not trivial; it requires strict adherence to an appropriate methodology. The contribution of this paper is to provide such a method including a software design and to demonstrate this working for Tetris, played either by AI or human.
Submitted 10 August, 2020;
originally announced August 2020.
-
Modulation of viability signals for self-regulatory control
Authors:
Alvaro Ovalle,
Simon M. Lucas
Abstract:
We revisit the role of instrumental value as a driver of adaptive behavior. In active inference, instrumental or extrinsic value is quantified by the information-theoretic surprisal of a set of observations measuring the extent to which those observations conform to prior beliefs or preferences. That is, an agent is expected to seek the type of evidence that is consistent with its own model of the world. For reinforcement learning tasks, the distribution of preferences replaces the notion of reward. We explore a scenario in which the agent learns this distribution in a self-supervised manner. In particular, we highlight the distinction between observations induced by the environment and those pertaining more directly to the continuity of an agent in time. We evaluate our methodology in a dynamic environment with discrete time and actions. First with a surprisal minimizing model-free agent (in the RL sense) and then expanding to the model-based case to minimize the expected free energy.
Submitted 13 October, 2020; v1 submitted 17 July, 2020;
originally announced July 2020.
-
Mapping the Future of Particle Radiobiology in Europe: The INSPIRE Project
Authors:
N. T. Henthorn,
O. Sokol,
M. Durante,
L. De Marzi,
F. Pouzoulet,
J. Miszczyk,
P. Olko,
S. Brandenburg,
M-J. van Goethem,
L. Barazzuol,
M. Tambas,
J. A. Langendijk,
M. Davidkova,
V. Vondravcek,
E. Bodenstein,
J. Pawelke,
A. Lomax,
D. C. Weber,
A. Dasu,
B. Stenerlow,
P. R. Poulsen,
B. S. Sorensen,
C. Grau,
M. K. Sitarz,
A-C Heuskin, et al. (5 additional authors not shown)
Abstract:
Particle therapy is a growing cancer treatment modality worldwide. However, there still remains a number of unanswered questions considering differences in the biological response between particles and photons. These questions, and probing of biological mechanisms in general, necessitate experimental investigation. The Infrastructure in Proton International Research (INSPIRE) project was created to provide an infrastructure for European research, unify research efforts on the topic of proton and ion therapy across Europe, and to facilitate the sharing of information and resources. This work highlights the radiobiological capabilities of the INSPIRE partners, providing details of physics (available particle types and energies), biology (sample preparation and post-irradiation analysis), and researcher access (the process of applying for beam time). The collection of information reported here is designed to provide researchers both in Europe and worldwide with the tools required to select the optimal center for their research needs. We also highlight areas of redundancy in capabilities and suggest areas for future investment.
Submitted 7 July, 2020;
originally announced July 2020.
-
Does it matter how well I know what you're thinking? Opponent Modelling in an RTS game
Authors:
James Goodman,
Simon Lucas
Abstract:
Opponent Modelling tries to predict the future actions of opponents, and is required to perform well in multi-player games. There is a deep literature on learning an opponent model, but much less on how accurate such models must be to be useful. We investigate the sensitivity of Monte Carlo Tree Search (MCTS) and a Rolling Horizon Evolutionary Algorithm (RHEA) to the accuracy of their modelling of the opponent in a simple Real-Time Strategy game. We find that in this domain RHEA is much more sensitive to the accuracy of an opponent model than MCTS. MCTS generally does better even with an inaccurate model, while this will degrade RHEA's performance. We show that faced with an unknown opponent and a low computational budget it is better not to use any explicit model with RHEA, and to model the opponent's actions within the tree as part of the MCTS algorithm.
Submitted 15 June, 2020;
originally announced June 2020.
-
Rinascimento: using event-value functions for playing Splendor
Authors:
Ivan Bravi,
Simon Lucas
Abstract:
In the realm of games research, Artificial General Intelligence algorithms often use score as the main reward signal for learning or playing actions. However, this has shown severe limitations when the point rewards are very rare or absent until the end of the game. This paper proposes a new approach based on event logging: the game state triggers an event every time one of its features changes. These events are processed by an Event-value Function (EF) that assigns a value to a single action or a sequence. The experiments have shown that such an approach can mitigate the problem of scarce point rewards and improve the AI performance. Furthermore, this represents a step forward in controlling the strategy adopted by the artificial agent, by describing a much richer and controllable behavioural space through the EF. Tuned EFs are able to neatly synthesise the relevance of the events in the game. Agents using an EF are more robust when playing games with several opponents.
Submitted 10 June, 2020;
originally announced June 2020.
-
Evaluating Generalisation in General Video Game Playing
Authors:
Martin Balla,
Simon M. Lucas,
Diego Perez-Liebana
Abstract:
The General Video Game Artificial Intelligence (GVGAI) competition has been running for several years with various tracks. This paper focuses on the challenge of the GVGAI learning track in which 3 games are selected and 2 levels are given for training, while 3 hidden levels are left for evaluation. This setup poses a difficult challenge for current Reinforcement Learning (RL) algorithms, as they typically require much more data. This work investigates 3 versions of the Advantage Actor-Critic (A2C) algorithm trained on a maximum of 2 levels from the available 5 from the GVGAI framework and compares their performance on all levels. The selected sub-set of games have different characteristics, like stochasticity, reward distribution and objectives. We found that stochasticity improves the generalisation, but too much can cause the algorithms to fail to learn the training levels. The quality of the training levels also matters, different sets of training levels can boost generalisation over all levels. In the GVGAI competition agents are scored based on their win rates and then their scores achieved in the games. We found that solely using the rewards provided by the game might not encourage winning.
Submitted 22 May, 2020;
originally announced May 2020.
-
Bootstrapped model learning and error correction for planning with uncertainty in model-based RL
Authors:
Alvaro Ovalle,
Simon M. Lucas
Abstract:
Having access to a forward model enables the use of planning algorithms such as Monte Carlo Tree Search and Rolling Horizon Evolution. Where a model is unavailable, a natural aim is to learn a model that accurately reflects the dynamics of the environment. In many situations this might not be possible, and even minimal glitches in the model may lead to poor performance and failure. This paper explores the problem of model misspecification through uncertainty-aware reinforcement learning agents. We propose a bootstrapped multi-headed neural network that learns the distribution of future states and rewards. We experiment with a number of schemes to extract the most likely predictions. We also introduce a global error correction filter that applies high-level constraints guided by the context provided through the predictive distribution. We illustrate our approach on Minipacman. The evaluation demonstrates that when dealing with imperfect models, our methods exhibit increased performance and stability, both in terms of model accuracy and in its use within a planning algorithm.
Submitted 15 April, 2020;
originally announced April 2020.
-
Interactive Evolution and Exploration Within Latent Level-Design Space of Generative Adversarial Networks
Authors:
Jacob Schrum,
Jake Gutierrez,
Vanessa Volz,
Jialin Liu,
Simon Lucas,
Sebastian Risi
Abstract:
Generative Adversarial Networks (GANs) are an emerging form of indirect encoding. The GAN is trained to induce a latent space on training data, and a real-valued evolutionary algorithm can search that latent space. Such Latent Variable Evolution (LVE) has recently been applied to game levels. However, it is hard for objective scores to capture level features that are appealing to players. Therefore, this paper introduces a tool for interactive LVE of tile-based levels for games. The tool also allows for direct exploration of the latent dimensions, and allows users to play discovered levels. The tool works for a variety of GAN models trained for both Super Mario Bros. and The Legend of Zelda, and is easily generalizable to other games. A user study shows that both the evolution and latent space exploration features are appreciated, with a slight preference for direct exploration, but combining these features allows users to discover even better levels. User feedback also indicates how this system could eventually grow into a commercial design tool, with the addition of a few enhancements.
Submitted 31 March, 2020;
originally announced April 2020.
-
Enhanced Rolling Horizon Evolution Algorithm with Opponent Model Learning: Results for the Fighting Game AI Competition
Authors:
Zhentao Tang,
Yuanheng Zhu,
Dongbin Zhao,
Simon M. Lucas
Abstract:
The Fighting Game AI Competition (FTGAIC) provides a challenging benchmark for 2-player video game AI. The challenge arises from the large action space, diverse styles of characters and abilities, and the real-time nature of the game. In this paper, we propose a novel algorithm that combines the Rolling Horizon Evolution Algorithm (RHEA) with opponent model learning. The approach is readily applicable to any 2-player video game. In contrast to conventional RHEA, an opponent model is proposed and is optimized by supervised learning with cross-entropy and by reinforcement learning with policy gradient and Q-learning respectively, based on historical observations of the opponent. The model is learned during live gameplay. With the learned opponent model, the extended RHEA is able to make more realistic plans based on what the opponent is likely to do. This tends to lead to better results. We compared our approach directly with the bots from the FTGAIC 2018 competition, and found our method to significantly outperform all of them, for all three characters. Furthermore, our proposed bot with the policy-gradient-based opponent model is the only one among the top five bots in the 2019 competition that does not use Monte-Carlo Tree Search (MCTS); it achieved second place while using much less domain knowledge than the winner.
Submitted 31 March, 2020;
originally announced March 2020.
-
Rolling Horizon Evolutionary Algorithms for General Video Game Playing
Authors:
Raluca D. Gaina,
Sam Devlin,
Simon M. Lucas,
Diego Perez-Liebana
Abstract:
Game-playing Evolutionary Algorithms, specifically Rolling Horizon Evolutionary Algorithms (RHEA), have recently managed to beat the state of the art in win rate across many video games. However, the best results in a game are highly dependent on the specific configuration of modifications and hybrids introduced over several papers, each adding additional parameters to the core algorithm. Further, the best previously published parameters have been found from only a few human-picked combinations, as the possibility space has grown beyond exhaustive search. This paper presents the state of the art in Rolling Horizon Evolutionary Algorithms, combining all modifications described in the literature, as well as new ones, for a large resultant hybrid. We then use a parameter optimiser, the N-Tuple Bandit Evolutionary Algorithm, to find the best combination of parameters in 20 games from the General Video Game AI Framework. Further, we analyse the algorithm's parameters and some interesting combinations revealed through the optimisation process. Lastly, we find new state-of-the-art solutions on several games by automatically exploring the large parameter space of RHEA.
Submitted 24 August, 2020; v1 submitted 27 March, 2020;
originally announced March 2020.
-
Neural Game Engine: Accurate learning of generalizable forward models from pixels
Authors:
Chris Bamford,
Simon Lucas
Abstract:
Access to a fast and easily copied forward model of a game is essential for model-based reinforcement learning and for algorithms such as Monte Carlo tree search, and is also beneficial as a source of unlimited experience data for model-free algorithms. Learning forward models is an interesting and important challenge in order to address problems where a model is not available. Building upon previous work on the Neural GPU, this paper introduces the Neural Game Engine as a way to learn models directly from pixels. The learned models are able to generalise, without loss of accuracy, to game levels of different sizes from the ones they were trained on. Results on 10 deterministic General Video Game AI games demonstrate competitive performance, with many of the games' models being learned perfectly both in terms of pixel predictions and reward predictions. The pre-trained models are available through the OpenAI Gym interface and are publicly available for future research at https://github.com/Bam4d/Neural-Game-Engine
Submitted 31 March, 2020; v1 submitted 23 March, 2020;
originally announced March 2020.
-
Weighting NTBEA for Game AI Optimisation
Authors:
James Goodman,
Simon Lucas
Abstract:
The N-Tuple Bandit Evolutionary Algorithm (NTBEA) has proven very effective in optimising algorithm parameters in Game AI. A potential weakness is the use of a simple average of all component Tuples in the model. This study investigates a refinement to the N-Tuple model used in NTBEA by weighting these component Tuples by their level of information and specificity of match. We introduce weighting functions to the model to obtain Weighted-NTBEA and test this on four benchmark functions and two game environments. These tests show that vanilla NTBEA is the most reliable and performant of the algorithms tested. Furthermore, we show that given an iteration budget it is better to execute several independent NTBEA runs, and use part of the budget to find the best recommendation from these runs.
Submitted 1 April, 2020; v1 submitted 23 March, 2020;
originally announced March 2020.
-
Learning Local Forward Models on Unforgiving Games
Authors:
Alexander Dockhorn,
Simon M. Lucas,
Vanessa Volz,
Ivan Bravi,
Raluca D. Gaina,
Diego Perez-Liebana
Abstract:
This paper examines learning approaches for forward models based on local cell transition functions. We provide a formal definition of local forward models for which we propose two basic learning approaches. Our analysis is based on the game Sokoban, where a wrong action can lead to an unsolvable game state. Therefore, an accurate prediction of an action's resulting state is necessary to avoid this scenario.
In contrast to learning the complete state transition function, local forward models allow extracting multiple training examples from a single state transition. In this way, the Hash Set model, as well as the Decision Tree model, quickly learn to predict upcoming state transitions of both the training and the test set. Applying the model using a statistical forward planner showed that the best models can be used to a satisfying degree even in cases in which the test levels have not yet been seen.
Our evaluation includes an analysis of various local neighbourhood patterns and sizes to test the learners' capabilities in case too few or too many attributes are extracted, of which the latter has been shown to degrade the performance of the model learner.
Submitted 1 September, 2019;
originally announced September 2019.
-
Project Thyia: A Forever Gameplayer
Authors:
Raluca D. Gaina,
Simon M. Lucas,
Diego Perez-Liebana
Abstract:
The space of Artificial Intelligence entities is dominated by conversational bots. Some of them fit in our pockets and we take them everywhere we go, or allow them to be a part of human homes. Siri and Alexa are recognised as present in our world. But a lot of games research is restricted to existing in the separate realm of software. We enter different worlds when playing games, but those worlds cease to exist once we quit. Similarly, AI game-players are run once on a game (or maybe for longer periods of time, in the case of learning algorithms which need some, still limited, period for training), and they cease to exist once the game ends. But what if they didn't? What if there existed artificial game-players that continuously played games, learned from their experiences and kept getting better? What if they interacted with the real world and us, humans: live-streaming games, chatting with viewers, accepting suggestions for strategies or games to play, forming opinions on popular game titles? In this paper, we introduce the vision behind a new project called Thyia, which focuses on creating a present, continuous, 'always-on', interactive game-player.
Submitted 10 June, 2019;
originally announced June 2019.
-
Foundations of Digital Archæoludology
Authors:
Cameron Browne,
Dennis J. N. J. Soemers,
Éric Piette,
Matthew Stephenson,
Michael Conrad,
Walter Crist,
Thierry Depaulis,
Eddie Duggan,
Fred Horn,
Steven Kelk,
Simon M. Lucas,
João Pedro Neto,
David Parlett,
Abdallah Saffidine,
Ulrich Schädler,
Jorge Nuno Silva,
Alex de Voogt,
Mark H. M. Winands
Abstract:
Digital Archaeoludology (DAL) is a new field of study involving the analysis and reconstruction of ancient games from incomplete descriptions and archaeological evidence using modern computational techniques. The aim is to provide digital tools and methods to help game historians and other researchers better understand traditional games, their development throughout recorded human history, and their relationship to the development of human culture and mathematical knowledge. This work is being explored in the ERC-funded Digital Ludeme Project.
The aim of this inaugural international research meeting on DAL is to gather together leading experts in relevant disciplines - computer science, artificial intelligence, machine learning, computational phylogenetics, mathematics, history, archaeology, anthropology, etc. - to discuss the key themes and establish the foundations for this new field of research, so that it may continue beyond the lifetime of its initiating project.
Submitted 31 May, 2019;
originally announced May 2019.
-
Tile Pattern KL-Divergence for Analysing and Evolving Game Levels
Authors:
Simon M. Lucas,
Vanessa Volz
Abstract:
This paper provides a detailed investigation of using the Kullback-Leibler (KL) Divergence as a way to compare and analyse game-levels, and hence to use the measure as the objective function of an evolutionary algorithm to evolve new levels. We describe the benefits of its asymmetry for level analysis and demonstrate how (not surprisingly) the quality of the results depends on the features used. Here we use tile-patterns of various sizes as features.
When using the measure for evolution-based level generation, we demonstrate that the choice of variation operator is critical in order to provide an efficient search process, and introduce a novel convolutional mutation operator to facilitate this. We compare the results with alternative generators, including evolving in the latent space of generative adversarial networks, and Wave Function Collapse. The results clearly show the proposed method to provide competitive performance, providing reasonable quality results with very fast training and reasonably fast generation.
Submitted 24 April, 2019;
originally announced May 2019.
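A minimal Python sketch of the tile-pattern KL-divergence idea summarised in the abstract above. The level encoding (a list of equal-length strings, one character per tile), the function names, the 2x2 default window and the epsilon smoothing are all assumptions made for illustration; they are not details confirmed by this listing.

```python
import math
from collections import Counter


def tile_pattern_counts(level, size=2):
    """Count every size x size window of tiles in a level.

    `level` is assumed to be a list of equal-length strings.
    """
    counts = Counter()
    for y in range(len(level) - size + 1):
        for x in range(len(level[0]) - size + 1):
            window = tuple(row[x:x + size] for row in level[y:y + size])
            counts[window] += 1
    return counts


def kl_divergence(p_counts, q_counts, eps=1e-5):
    """KL(P || Q) over the union of observed patterns.

    Epsilon smoothing (an assumption here) stops patterns that are
    missing from one level from causing log(0) or division by zero.
    """
    patterns = set(p_counts) | set(q_counts)
    p_norm = sum(p_counts.values()) + eps * len(patterns)
    q_norm = sum(q_counts.values()) + eps * len(patterns)
    total = 0.0
    for pattern in patterns:
        p = (p_counts[pattern] + eps) / p_norm
        q = (q_counts[pattern] + eps) / q_norm
        total += p * math.log(p / q)
    return total


def level_divergence(level_a, level_b, size=2):
    """Hypothetical helper: divergence of level_a's patterns from level_b's."""
    return kl_divergence(tile_pattern_counts(level_a, size),
                         tile_pattern_counts(level_b, size))
```

Because the measure is asymmetric, `level_divergence(a, b)` and `level_divergence(b, a)` generally differ; a search procedure could minimise either direction against a target level, depending on whether missing or novel patterns should be penalised more.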
-
Mek: Mechanics Prototyping Tool for 2D Tile-Based Turn-Based Deterministic Games
Authors:
Rokas Volkovas,
Michael Fairbank,
John Woodward,
Simon Lucas
Abstract:
There are few digital tools to help designers create game mechanics. A general language to express game mechanics is necessary for rapid game design iteration. The first iteration of a mechanics-focused language, together with its interfacing tool, is introduced in this paper. The language is restricted to two-dimensional, turn-based, tile-based, deterministic, complete-information games. The tool is compared to the existing alternatives for game mechanics prototyping and shown to be capable of succinctly implementing a range of well-known game mechanics.
Submitted 6 April, 2019;
originally announced April 2019.
-
Rinascimento: Optimising Statistical Forward Planning Agents for Playing Splendor
Authors:
Ivan Bravi,
Simon Lucas,
Diego Perez-Liebana,
Jialin Liu
Abstract:
Game-based benchmarks have been playing an essential role in the development of Artificial Intelligence (AI) techniques. Providing diverse challenges is crucial to push research toward innovation and understanding in modern techniques. Rinascimento provides a parameterised, partially-observable, multiplayer, card-based board game; these parameters can easily modify the rules, objectives and items in the game. We describe the framework in all its features and the game-playing challenge, providing baseline game-playing AIs and an analysis of their skills. We give agents' hyper-parameter tuning a central role in the experiments, highlighting how it can heavily influence performance. The baseline agents contain several additional contributions to Statistical Forward Planning algorithms.
Submitted 3 April, 2019;
originally announced April 2019.
-
A Local Approach to Forward Model Learning: Results on the Game of Life Game
Authors:
Simon M. Lucas,
Alexander Dockhorn,
Vanessa Volz,
Chris Bamford,
Raluca D. Gaina,
Ivan Bravi,
Diego Perez-Liebana,
Sanaz Mostaghim,
Rudolf Kruse
Abstract:
This paper investigates the effect of learning a forward model on the performance of a statistical forward planning agent. We transform Conway's Game of Life simulation into a single-player game where the objective can be either to preserve as much life as possible or to extinguish all life as quickly as possible.
In order to learn the forward model of the game, we formulate the problem in a novel way that learns the local cell transition function by creating a set of supervised training data and predicting the next state of each cell in the grid based on its current state and immediate neighbours. Using this method we are able to harvest sufficient data to learn perfect forward models by observing only a few complete state transitions, using either a look-up table, a decision tree or a neural network.
In contrast, learning the complete state transition function is a much harder task and our initial efforts to do this using deep convolutional auto-encoders were less successful.
We also investigate the effects of imperfect learned models on prediction errors and game-playing performance, and show that even models with significant errors can provide good performance.
Submitted 29 March, 2019;
originally announced March 2019.
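The local formulation described in the abstract above, restricted to the look-up table variant, can be sketched in Python as follows. The toroidal (wrap-around) grid, the function names and the use of random training grids are assumptions for illustration only; the listing does not specify these details.

```python
import random


def life_step(grid):
    """True Game of Life transition on a toroidal grid (the dynamics to be learned)."""
    h, w = len(grid), len(grid[0])
    nxt = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            n = sum(grid[(y + dy) % h][(x + dx) % w]
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                    if (dy, dx) != (0, 0))
            nxt[y][x] = 1 if n == 3 or (grid[y][x] == 1 and n == 2) else 0
    return nxt


def local_pattern(grid, y, x):
    """The 3x3 neighbourhood around (y, x), including the cell itself, as a tuple."""
    h, w = len(grid), len(grid[0])
    return tuple(grid[(y + dy) % h][(x + dx) % w]
                 for dy in (-1, 0, 1) for dx in (-1, 0, 1))


def learn_lookup_table(transitions):
    """Each observed state transition yields one training example per cell."""
    table = {}
    for before, after in transitions:
        for y in range(len(before)):
            for x in range(len(before[0])):
                table[local_pattern(before, y, x)] = after[y][x]
    return table


def predict(table, grid, default=0):
    """Apply the learned local model to every cell; unseen patterns map to `default`."""
    return [[table.get(local_pattern(grid, y, x), default)
             for x in range(len(grid[0]))]
            for y in range(len(grid))]


def random_grid(rng, h, w):
    """Hypothetical helper: a grid with each cell alive with probability 0.5."""
    return [[1 if rng.random() < 0.5 else 0 for _ in range(w)] for _ in range(h)]
```

A single transition of an h x w grid contributes h * w labelled examples, while only 2^9 = 512 distinct 3x3 patterns exist, which is why a handful of observed transitions can be enough to cover the whole local function.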
-
The Interdependence of Hierarchical Institutions: Federal Regulation, Job Creation, and the Moderating Effect of State Economic Freedom
Authors:
David S. Lucas,
Christopher J. Boudreaux
Abstract:
Regulation is commonly viewed as a hindrance to entrepreneurship, but heterogeneity in the effects of regulation is rarely explored. We focus on regional variation in the effects of national-level regulations by developing a theory of hierarchical institutional interdependence. Using the political science theory of market-preserving federalism, we argue that regional economic freedom attenuates the negative influence of national regulation on net job creation. Using U.S. data, we find that regulation destroys jobs on net, but regional economic freedom moderates this effect. In regions with average economic freedom, a one percent increase in regulation results in 14 fewer jobs created on net. However, a standard deviation increase in economic freedom attenuates this relationship by four fewer jobs. Interestingly, this moderation accrues strictly to older firms; regulation usually harms young firm job creation, and economic freedom does not attenuate this relationship.
Submitted 7 March, 2019;
originally announced March 2019.
-
Efficient Evolutionary Methods for Game Agent Optimisation: Model-Based is Best
Authors:
Simon M. Lucas,
Jialin Liu,
Ivan Bravi,
Raluca D. Gaina,
John Woodward,
Vanessa Volz,
Diego Perez-Liebana
Abstract:
This paper introduces a simple and fast variant of Planet Wars as a test-bed for statistical-planning-based Game AI agents, and for noisy hyper-parameter optimisation. Planet Wars is a real-time strategy game with simple rules but complex game-play. The variant introduced in this paper is designed for speed to enable efficient experimentation, and also for a fixed action space to enable practical inter-operability with General Video Game AI agents. If we treat the game as a win-loss game (which is standard), then this leads to challenging noisy optimisation problems both in tuning agents to play the game, and in tuning game parameters. Here we focus on the problem of tuning an agent, and report results using the recently developed N-Tuple Bandit Evolutionary Algorithm and a number of other optimisers, including Sequential Model-based Algorithm Configuration (SMAC). Results indicate that the N-Tuple Bandit Evolutionary Algorithm offers competitive performance as well as insight into the effects of combinations of parameter choices.
Submitted 3 January, 2019;
originally announced January 2019.
-
Proving Program Properties as First-Order Satisfiability
Authors:
Salvador Lucas
Abstract:
Program semantics can often be expressed as a (many-sorted) first-order theory S, and program properties as sentences $\varphi$ which are intended to hold in the canonical model of such a theory, which is often incomputable. Recently, we have shown that properties $\varphi$ expressed as the existential closure of a boolean combination of atoms can be disproved by just finding a model of S and the negation $\neg\varphi$ of $\varphi$. Furthermore, this idea works quite well in practice due to the existence of powerful tools for the automatic generation of models for (many-sorted) first-order theories. In this paper we extend our previous result to arbitrary properties, expressed as sentences without any special restriction. Consequently, one can prove a program property $\varphi$ by just finding a model of an appropriate theory (including S and possibly something else) and an appropriate first-order formula related to $\varphi$. Beyond its possible theoretical interest, we show that our results can also be of practical use in several respects.
Submitted 30 November, 2018; v1 submitted 13 August, 2018;
originally announced August 2018.
-
Game AI Research with Fast Planet Wars Variants
Authors:
Simon M. Lucas
Abstract:
This paper describes a new implementation of Planet Wars, designed from the outset for Game AI research. The skill-depth of the game makes it a challenge for game-playing agents, and the speed of more than 1 million game ticks per second enables rapid experimentation and prototyping. The parameterised nature of the game, together with an interchangeable actuator model, makes it well suited to automated game tuning. The game is designed to be fun to play for humans, and is directly playable by General Video Game AI agents.
Submitted 22 June, 2018;
originally announced June 2018.
-
Shallow decision-making analysis in General Video Game Playing
Authors:
Ivan Bravi,
Jialin Liu,
Diego Perez-Liebana,
Simon Lucas
Abstract:
The General Video Game AI competitions have been the testing ground for several techniques for game playing, such as evolutionary computation techniques, tree search algorithms, and hyper-heuristic-based or knowledge-based algorithms. So far the metrics used to evaluate the performance of agents have been win ratio, game score and length of games. In this paper we provide a wider set of metrics and a comparison method for evaluating and comparing agents. The metrics and the comparison method give shallow introspection into the agent's decision-making process and they can be applied to any agent regardless of its algorithmic nature. In this work, the metrics and the comparison method are used to measure the impact of the terms that compose a tree policy of an MCTS-based agent, comparing with several baseline agents. The results clearly show how promising such a general approach is and how it can be useful to understand the behaviour of an AI agent, in particular, how the comparison with baseline agents can help in understanding the shape of the agent's decision landscape. The presented metrics and comparison method represent a step toward more descriptive ways of logging and analysing agents' behaviours.
Submitted 4 June, 2018;
originally announced June 2018.
-
Evolving Mario Levels in the Latent Space of a Deep Convolutional Generative Adversarial Network
Authors:
Vanessa Volz,
Jacob Schrum,
Jialin Liu,
Simon M. Lucas,
Adam Smith,
Sebastian Risi
Abstract:
Generative Adversarial Networks (GANs) are a machine learning approach capable of generating novel example outputs across a space of provided training examples. Procedural Content Generation (PCG) of levels for video games could benefit from such models, especially for games where there is a pre-existing corpus of levels to emulate. This paper trains a GAN to generate levels for Super Mario Bros using a level from the Video Game Level Corpus. The approach successfully generates a variety of levels similar to one in the original corpus, but is further improved by application of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). Specifically, various fitness functions are used to discover levels within the latent space of the GAN that maximize desired properties. Simple static properties are optimized, such as a given distribution of tile types. Additionally, the champion A* agent from the 2009 Mario AI competition is used to assess whether a level is playable, and how many jumping actions are required to beat it. These fitness functions allow for the discovery of levels that exist within the space of examples designed by experts, and also guide the search towards levels that fulfill one or more specified objectives.
Submitted 2 May, 2018;
originally announced May 2018.
-
General Video Game AI: a Multi-Track Framework for Evaluating Agents, Games and Content Generation Algorithms
Authors:
Diego Perez-Liebana,
Jialin Liu,
Ahmed Khalifa,
Raluca D. Gaina,
Julian Togelius,
Simon M. Lucas
Abstract:
General Video Game Playing (GVGP) aims at designing an agent that is capable of playing multiple video games with no human intervention. In 2014, the General Video Game AI (GVGAI) competition framework was created and released with the purpose of providing researchers with a common, open-source and easy-to-use platform for testing their AI methods on a potentially infinite number of games created using the Video Game Description Language (VGDL). The framework has been expanded into several tracks during the last few years to meet the demands of different research directions. The agents are required either to play multiple unknown games with or without access to game simulations, or to design new game levels or rules. This survey paper presents the VGDL, the GVGAI framework and its existing tracks, and reviews the wide use of the GVGAI framework in research, education and competitions five years after its birth. A future plan of framework improvements is also described.
Submitted 22 February, 2019; v1 submitted 28 February, 2018;
originally announced February 2018.
-
The N-Tuple Bandit Evolutionary Algorithm for Game Agent Optimisation
Authors:
Simon M Lucas,
Jialin Liu,
Diego Perez-Liebana
Abstract:
This paper describes the N-Tuple Bandit Evolutionary Algorithm (NTBEA), an optimisation algorithm developed for noisy and expensive discrete (combinatorial) optimisation problems. The algorithm is applied to two game-based hyper-parameter optimisation problems. The N-Tuple system directly models the statistics, approximating the fitness and number of evaluations of each modelled combination of parameters. The model is simple, efficient and informative. Results show that the NTBEA significantly outperforms grid search and an estimation of distribution algorithm.
Submitted 8 May, 2018; v1 submitted 16 February, 2018;
originally announced February 2018.
-
Higher physical fitness levels are associated with less language decline in healthy ageing
Authors:
K. Segaert,
S. J. E. Lucas,
C. V. Burley,
Pieter Segaert,
A. E. Milner,
M. Ryan,
L. Wheeldon
Abstract:
Healthy ageing is associated with decline in cognitive abilities such as language. Aerobic fitness has been shown to ameliorate decline in some cognitive domains, but the potential benefits for language have not been examined. In a cross-sectional sample, we investigated the relationship between aerobic fitness and tip-of-the-tongue states. These are among the most frequent cognitive failures in healthy older adults and occur when a speaker knows a word but is unable to produce it. We found that healthy older adults indeed experience more tip-of-the-tongue states than young adults. Importantly, higher aerobic fitness levels decrease the probability of experiencing tip-of-the-tongue states in healthy older adults. Fitness-related differences in word finding abilities are observed over and above effects of age. This is the first demonstration of a link between aerobic fitness and language functioning in healthy older adults.
Submitted 12 April, 2018; v1 submitted 4 January, 2018;
originally announced January 2018.
-
Stratigraphy of Aeolis Dorsa, Mars: stratigraphic context of the great river deposits
Authors:
Edwin S. Kite,
Alan D. Howard,
Antoine S. Lucas,
John C. Armstrong,
Oded Aharonson,
Michael P. Lamb
Abstract:
Unraveling the stratigraphic record is the key to understanding ancient climate and past climate changes on Mars. River deposits when placed in stratigraphic order could constrain the number, magnitudes, and durations of the wettest climates in Mars history. We establish the stratigraphic context of river deposits in Aeolis Dorsa sedimentary basin, 10E of Gale crater. Here, wind has exhumed a stratigraphic section of >=4 unconformity-bounded sedimentary rock packages, recording >=3 distinct episodes of surface runoff. Early deposits (>700m thick) are embayed by river deposits (>400m), which are in turn unconformably draped by fan-shaped deposits (<100m) which we interpret as alluvial fans. Yardang-forming deposits (>900 m) unconformably drape all previous deposits. River deposits embay a dissected sedimentary-rock landscape, and comprise >=2 distinguishable units. The total interval spanned by river deposits is >(1x10^6-2x10^7) yr; more if we include alluvial-fan deposits. Alluvial-fan deposits unconformably postdate thrust faults which crosscut river deposits. We infer a relatively dry interval of >4x10^7 yr after river deposits formed and before fan-shaped deposits formed. The time gap between the end of river deposition and the onset of yardang-forming deposits is constrained to >10^8 yr by the density of impact craters embedded at the unconformity. We correlate yardang-forming deposits to the upper layers of Gale crater's mound (Mt. Sharp/Aeolis Mons), and fan-shaped deposits to Peace Vallis fan. Alternations between periods of low vs. high mean obliquity may have modulated erosion-deposition cycling in Aeolis. This is consistent with results from an ensemble of simulations of Solar System orbital evolution and the resulting history of Mars obliquity. Almost all simulations yield intervals of continuously low mean Mars obliquity that are long enough to match our unconformity data.
Submitted 9 December, 2017;
originally announced December 2017.
-
A Semantic Approach to the Analysis of Rewriting-Based Systems
Authors:
Salvador Lucas
Abstract:
Properties expressed as the provability of a first-order sentence can be disproved by just finding a model of the negation of the sentence. This fact, however, is meaningful in restricted cases only, depending on the shape of the sentence and the class of systems at stake. In this paper we show that a number of interesting properties of rewriting-based systems can be investigated in this way, including infeasibility and non-joinability of critical pairs in (conditional) rewriting, non-loopingness of conditional rewrite systems, or the secure access to protected pages of a web site modeled as an order-sorted rewrite theory. Interestingly, this uniform, semantic approach succeeds when specific techniques developed to deal with the aforementioned problems fail.
Submitted 15 September, 2017;
originally announced September 2017.
-
Evolution of the partially frustrated magnetic order in CePd$_{1-x}$Ni$_x$Al
Authors:
Zita Huesges,
Stefan Lucas,
Sarah Wunderlich,
Fabiano Yokaichiya,
Karel Prokeš,
Karin Schmalzl,
Marie-Hélène Lemée-Cailleau,
Bjørn Pedersen,
Veronika Fritsch,
Hilbert v. Löhneysen,
Oliver Stockert
Abstract:
We report on a single-crystal neutron diffraction study of the evolution of the antiferromagnetic order in the heavy-fermion compound CePd$_{1-x}$Ni$_x$Al which exhibits partial geometric frustration due to its distorted Kagomé structure. The magnetic structure is found to be unchanged with a propagation vector $Q_\mathrm{AF} \approx (0.5~0~0.35)$ for all Ni concentrations $x$ up to $x_c \approx 0.14$. Upon approaching the quantum critical concentration $x_c$, the ordered moment vanishes linearly with Néel temperature $T_{\rm N}$, in good agreement with CePdAl under hydrostatic pressure. For all Ni concentrations, substantial short-range magnetic correlations are observed above $T_{\rm N}$ as a result of frustration.
Submitted 8 September, 2017;
originally announced September 2017.
-
Efficient Noisy Optimisation with the Sliding Window Compact Genetic Algorithm
Authors:
Simon M. Lucas,
Jialin Liu,
Diego Pérez-Liébana
Abstract:
The compact genetic algorithm is an Estimation of Distribution Algorithm for binary optimisation problems. Unlike the standard Genetic Algorithm, no cross-over or mutation is involved. Instead, the compact Genetic Algorithm uses a virtual population represented as a probability distribution over the set of binary strings. At each optimisation iteration, exactly two individuals are generated by sampling from the distribution, and compared exactly once to determine a winner and a loser. The probability distribution is then adjusted to increase the likelihood of generating individuals similar to the winner.
This paper introduces two straightforward variations of the compact Genetic Algorithm, each of which leads to a significant improvement in performance. The main idea is to make better use of each fitness evaluation, by ensuring that each evaluated individual is used in multiple win/loss comparisons. The first variation is to sample $n>2$ individuals at each iteration to make $n(n-1)/2$ comparisons. The second variation only samples one individual at each iteration but keeps a sliding history window of previous individuals to compare with. We evaluate the methods on two noisy test problems and show that in each case they significantly outperform the compact Genetic Algorithm, while maintaining the simplicity of the algorithm.
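The update rule described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the fixed step size of 1/virtual_pop, the window length, the reuse of stored fitness values inside the window, and the noisy OneMax test problem are all assumptions made for illustration.

```python
import random
from collections import deque


def sample(probs, rng):
    """Draw one binary string from the per-bit probability vector."""
    return [1 if rng.random() < p else 0 for p in probs]


def update(probs, winner, loser, step):
    """Shift each differing bit's probability towards the winner's value."""
    for i, (w, l) in enumerate(zip(winner, loser)):
        if w != l:
            delta = step if w == 1 else -step
            probs[i] = min(1.0, max(0.0, probs[i] + delta))


def compact_ga(fitness, n_bits, iterations, virtual_pop=50, seed=0):
    """Baseline: two samples and exactly one comparison per iteration."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits
    for _ in range(iterations):
        a, b = sample(probs, rng), sample(probs, rng)
        fa, fb = fitness(a, rng), fitness(b, rng)
        winner, loser = (a, b) if fa >= fb else (b, a)
        update(probs, winner, loser, 1.0 / virtual_pop)
    return probs


def sliding_window_cga(fitness, n_bits, iterations, window=10,
                       virtual_pop=50, seed=0):
    """Variant: one new sample per iteration, compared against a history window.

    Assumption: each stored individual keeps the single noisy fitness value
    it received when it was first evaluated.
    """
    rng = random.Random(seed)
    probs = [0.5] * n_bits
    history = deque(maxlen=window)  # oldest entries drop out automatically
    for _ in range(iterations):
        x = sample(probs, rng)
        fx = fitness(x, rng)
        for y, fy in history:
            winner, loser = (x, y) if fx >= fy else (y, x)
            update(probs, winner, loser, 1.0 / virtual_pop)
        history.append((x, fx))
    return probs


def noisy_onemax(bits, rng, sigma=1.0):
    """Hypothetical test problem: count of ones plus Gaussian noise."""
    return sum(bits) + rng.gauss(0.0, sigma)
```

Calling `compact_ga(noisy_onemax, 10, 2000)` returns the final probability vector. In the sliding-window variant each fitness evaluation feeds up to `window` comparisons instead of one, which is the reuse of evaluations the abstract refers to.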
Submitted 7 August, 2017;
originally announced August 2017.
-
Evaluating Noisy Optimisation Algorithms: First Hitting Time is Problematic
Authors:
Simon M. Lucas,
Jialin Liu,
Diego Pérez-Liébana
Abstract:
A key part of any evolutionary algorithm is fitness evaluation. When fitness evaluations are corrupted by noise, as happens in many real-world problems as a consequence of various types of uncertainty, a strategy is needed in order to cope with this. Resampling is one of the most common strategies, whereby each solution is evaluated many times in order to reduce the variance of the fitness estimates. When evaluating the performance of a noisy optimisation algorithm, a key consideration is the stopping condition for the algorithm. A frequently used stopping condition in runtime analysis, known as "First Hitting Time", is to stop the algorithm as soon as it encounters the optimal solution. However, this is unrealistic for real-world problems, as if the optimal solution were already known, there would be no need to search for it. This paper argues that the use of First Hitting Time, despite being a commonly used approach, is significantly flawed and overestimates the quality of many algorithms in real-world cases, where the optimum is not known in advance and has to be genuinely searched for. A better alternative is to measure the quality of the solution an algorithm returns after a fixed evaluation budget, i.e., to focus on final solution quality. This paper argues that focussing on final solution quality is more realistic and demonstrates cases where the results produced by each algorithm evaluation method lead to very different conclusions regarding the quality of each noisy optimisation algorithm.
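The resampling strategy mentioned in this abstract can be illustrated with a short Python sketch. It is an illustration only, not code from the paper: the additive Gaussian noise model, the function names, and the trial count are assumptions. Under independent noise with standard deviation sigma, averaging k evaluations gives an estimate with standard deviation sigma / sqrt(k), at the cost of k times as many evaluations per solution.

```python
import random
import statistics


def noisy_fitness(x, rng, sigma=1.0):
    """Hypothetical noisy objective: true value sum(x) plus Gaussian noise."""
    return sum(x) + rng.gauss(0.0, sigma)


def resampled_fitness(x, rng, k, sigma=1.0):
    """Average k independent noisy evaluations of the same solution."""
    values = [noisy_fitness(x, rng, sigma) for _ in range(k)]
    return statistics.fmean(values)


def estimate_spread(x, k, trials=2000, seed=0):
    """Empirical standard deviation of the k-sample fitness estimate."""
    rng = random.Random(seed)
    estimates = [resampled_fitness(x, rng, k) for _ in range(trials)]
    return statistics.pstdev(estimates)
```

Comparing `estimate_spread([1, 0, 1], 1)` with `estimate_spread([1, 0, 1], 16)` shows the spread shrinking by roughly a factor of four, which is why the evaluation budget, rather than the iteration count, is the natural unit when comparing noisy optimisers.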
Submitted 12 July, 2017; v1 submitted 13 June, 2017;
originally announced June 2017.
-
The N-Tuple Bandit Evolutionary Algorithm for Automatic Game Improvement
Authors:
Kamolwan Kunanusont,
Raluca D. Gaina,
Jialin Liu,
Diego Perez-Liebana,
Simon M. Lucas
Abstract:
This paper describes a new evolutionary algorithm that is especially well suited to AI-Assisted Game Design. The approach adopted in this paper is to use observations of AI agents playing the game to estimate the game's quality. Some of the best agents for this purpose are General Video Game AI agents, since they can be deployed directly on a new game without game-specific tuning; these agents tend to be based on stochastic algorithms which give robust but noisy results and tend to be expensive to run. This motivates the main contribution of the paper: the development of the novel N-Tuple Bandit Evolutionary Algorithm, where a model is used to estimate the fitness of unsampled points and a bandit approach is used to balance exploration and exploitation of the search space. Initial results on optimising a Space Battle game variant suggest that the algorithm offers far more robust results than the Random Mutation Hill Climber and a Biased Mutation variant, which are themselves known to offer competitive performance across a range of problems. Subjective observations are also given by human players on the nature of the evolved games, which indicate a preference towards games generated by the N-Tuple algorithm.
Submitted 18 March, 2017;
originally announced May 2017.
-
Analysis of Vanilla Rolling Horizon Evolution Parameters in General Video Game Playing
Authors:
Raluca D. Gaina,
Jialin Liu,
Simon M. Lucas,
Diego Perez-Liebana
Abstract:
Monte Carlo Tree Search techniques have generally dominated General Video Game Playing, but recent research has started looking at Evolutionary Algorithms and their potential for matching the Tree Search level of play or even outperforming these methods. Online or Rolling Horizon Evolution is one of the options available to evolve sequences of actions for planning in General Video Game Playing, but no research to date has explored the capabilities of the vanilla version of this algorithm in multiple games. This study aims to critically analyse the different configurations regarding population size and individual length in a set of 20 games from the General Video Game AI corpus. Distinctions are made between deterministic and stochastic games, and the implications of using superior time budgets are studied. Results show that there is scope for the use of these techniques, which in some configurations outperform Monte Carlo Tree Search, and also suggest that further research in these methods could boost their performance.
Submitted 24 April, 2017;
originally announced April 2017.
-
Evaluating and Modelling Hanabi-Playing Agents
Authors:
Joseph Walton-Rivers,
Piers R. Williams,
Richard Bartle,
Diego Perez-Liebana,
Simon M. Lucas
Abstract:
Agent modelling involves considering how other agents will behave, in order to influence your own actions. In this paper, we explore the use of agent modelling in the hidden-information, collaborative card game Hanabi. We implement a number of rule-based agents, both from the literature and of our own devising, in addition to an Information Set Monte Carlo Tree Search (IS-MCTS) agent. We observe poor results from IS-MCTS, so we construct a new predictor version that uses a model of the agents with which it is paired. We observe a significant improvement in game-playing strength from this agent in comparison to IS-MCTS, resulting from its consideration of what the other agents in a game would do. In addition, we create a flawed rule-based agent to highlight the predictor's capabilities with such an agent.
Submitted 24 April, 2017;
originally announced April 2017.