EffoVPR: Effective Foundation Model Utilization for Visual Place Recognition

Authors: Issar Tzachor, Boaz Lerner, Matan Levy, Michael Green, Tal Berkovitz Shalev, Gavriel Habib, Dvir Samuel, Noam Korngut Zailer, Or Shimshi, Nir Darshan, Rami Ben-Ari

Abstract: The task of Visual Place Recognition (VPR) is to predict the location of a query image from a database of geo-tagged images. Recent studies in VPR have highlighted the significant advantage of employing pre-trained foundation models like DINOv2 for the VPR task. However, these models are often deemed inadequate for VPR without further fine-tuning on task-specific data. In this paper, we propose a simple yet powerful approach to better exploit the potential of a foundation model for VPR. We first demonstrate that features extracted from self-attention layers can serve as a powerful re-ranker for VPR. Utilizing these features in a zero-shot manner, our method surpasses previous zero-shot methods and achieves competitive results compared to supervised methods across multiple datasets. Subsequently, we demonstrate that a single-stage method leveraging internal ViT layers for pooling can generate global features that achieve state-of-the-art results, even when reduced to a dimensionality as low as 128D. Nevertheless, incorporating our local foundation features for re-ranking expands this gap. Our approach further demonstrates remarkable robustness and generalization, achieving state-of-the-art results, with a significant gap, in challenging scenarios involving occlusion, day-night variations, and seasonal changes.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2311.16172

Evolutionary Machine Learning and Games

Authors: Julian Togelius, Ahmed Khalifa, Sam Earle, Michael Cerny Green, Lisa Soros

Abstract: Evolutionary machine learning (EML) has been applied to games in multiple ways, and for multiple different purposes. Importantly, AI research in games is not only about playing games; it is also about generating game content, modeling players, and many other applications. Many of these applications pose interesting problems for EML. We will structure this chapter on EML for games based on whether evolution is used to augment machine learning (ML) or ML is used to augment evolution. For completeness, we also briefly discuss the usage of ML and evolution separately in games.

Submitted 20 November, 2023; originally announced November 2023.

Comments: 27 pages, 5 figures, part of Evolutionary Machine Learning Book (https://link.springer.com/book/10.1007/978-981-99-3814-8)

arXiv:2311.10538

Testing Language Model Agents Safely in the Wild

Authors: Silen Naihin, David Atkinson, Marc Green, Merwane Hamadi, Craig Swift, Douglas Schonholtz, Adam Tauman Kalai, David Bau

Abstract: A prerequisite for safe autonomy-in-the-wild is safe testing-in-the-wild. Yet real-world autonomous tests face several unique safety challenges, both due to the possibility of causing harm during a test, as well as the risk of encountering new unsafe agent behavior through interactions with real-world and potentially malicious actors. We propose a framework for conducting safe autonomous agent tests on the open internet: agent actions are audited by a context-sensitive monitor that enforces a stringent safety boundary to stop an unsafe test, with suspect behavior ranked and logged to be examined by humans. We design a basic safety monitor (AgentMonitor) that is flexible enough to monitor existing LLM agents, and, using an adversarial simulated agent, we measure its ability to identify and stop unsafe situations. Then we apply the AgentMonitor on a battery of real-world tests of AutoGPT, and we identify several limitations and challenges that will face the creation of safe in-the-wild tests as autonomous agents grow more capable.

Submitted 3 December, 2023; v1 submitted 17 November, 2023; originally announced November 2023.

arXiv:2308.10856

Majorana Demonstrator Data Release for AI/ML Applications

Authors: I. J. Arnquist, F. T. Avignone III, A. S. Barabash, C. J. Barton, K. H. Bhimani, E. Blalock, B. Bos, M. Busch, M. Buuck, T. S. Caldwell, Y. -D. Chan, C. D. Christofferson, P. -H. Chu, M. L. Clark, C. Cuesta, J. A. Detwiler, Yu. Efremenko, H. Ejiri, S. R. Elliott, N. Fuad, G. K. Giovanetti, M. P. Green, J. Gruszko, I. S. Guinn, V. E. Guiseppe, et al. (35 additional authors not shown)

Abstract: The enclosed data release consists of a subset of the calibration data from the Majorana Demonstrator experiment. Each Majorana event is accompanied by raw Germanium detector waveforms, pulse shape discrimination cuts, and calibrated final energies, all shared in an HDF5 file format along with relevant metadata. This release is specifically designed to support the training and testing of Artificial Intelligence (AI) and Machine Learning (ML) algorithms upon our data. This document is structured as follows. Section I provides an overview of the dataset's content and format; Section II outlines the location of this dataset and the method for accessing it; Section III presents the NPML Machine Learning Challenge associated with this dataset; Section IV contains a disclaimer from the Majorana collaboration regarding the use of this dataset; Appendix A contains technical details of this data release. Please direct questions about the material provided within this release to [email protected] (A. Li).

Submitted 14 September, 2023; v1 submitted 21 August, 2023; originally announced August 2023.

Comments: DataPlanet Access: https://dataplanet.ucsd.edu/dataset.xhtml?persistentId=perma:83.ucsddata/UQWQAV

arXiv:2306.05633

McFIL: Model Counting Functionality-Inherent Leakage

Authors: Maximilian Zinkus, Yinzhi Cao, Matthew Green

Abstract: Protecting the confidentiality of private data and using it for useful collaboration have long been at odds. Modern cryptography is bridging this gap through rapid growth in secure protocols such as multi-party computation, fully-homomorphic encryption, and zero-knowledge proofs. However, even with provable indistinguishability or zero-knowledgeness, confidentiality loss from leakage inherent to the functionality may partially or even completely compromise secret values without ever falsifying proofs of security. In this work, we describe McFIL, an algorithmic approach and accompanying software implementation which automatically quantifies intrinsic leakage for a given functionality. Extending and generalizing the Chosen-Ciphertext attack framework of Beck et al. with a practical heuristic, our approach not only quantifies but maximizes functionality-inherent leakage using Maximum Model Counting within a SAT solver. As a result, McFIL automatically derives approximately-optimal adversary inputs that, when used in secure protocols, maximize information leakage of private values.

Submitted 8 June, 2023; originally announced June 2023.

Comments: To appear in USENIX Security 2023

arXiv:2302.05817

doi 10.1145/3582437.3587211

Level Generation Through Large Language Models

Authors: Graham Todd, Sam Earle, Muhammad Umair Nasir, Michael Cerny Green, Julian Togelius

Abstract: Large Language Models (LLMs) are powerful tools, capable of leveraging their training on natural language to write stories, generate code, and answer questions. But can they generate functional video game levels? Game levels, with their complex functional constraints and spatial relationships in more than one dimension, are very different from the kinds of data an LLM typically sees during training. Datasets of game levels are also hard to come by, potentially taxing the abilities of these data-hungry models. We investigate the use of LLMs to generate levels for the game Sokoban, finding that LLMs are indeed capable of doing so, and that their performance scales dramatically with dataset size. We also perform preliminary experiments on controlling LLM level generators and discuss promising areas for future work.

Submitted 1 June, 2023; v1 submitted 11 February, 2023; originally announced February 2023.

Journal ref: FDG 2023: Proceedings of the 18th International Conference on the Foundations of Digital Games

arXiv:2211.03927

Automatic Error Detection in Integrated Circuits Image Segmentation: A Data-driven Approach

Authors: Zhikang Zhang, Bruno Machado Trindade, Michael Green, Zifan Yu, Christopher Pawlowicz, Fengbo Ren

Abstract: Due to the complicated nanoscale structures of current integrated circuit (IC) builds and the low error tolerance of IC image segmentation tasks, most existing automated IC image segmentation approaches require human experts for visual inspection to ensure correctness, which is one of the major bottlenecks in large-scale industrial applications. In this paper, we present the first data-driven automatic error detection approach targeting two types of IC segmentation errors: wire errors and via errors. On an IC image dataset collected from real industry, we demonstrate that, by adapting existing CNN-based approaches of image classification and image translation with additional pre-processing and post-processing techniques, we are able to achieve recall/precision of 0.92/0.93 in wire error detection and 0.96/0.90 in via error detection, respectively.

Submitted 7 November, 2022; originally announced November 2022.

arXiv:2209.06168

Borch: A Deep Universal Probabilistic Programming Language

Authors: Lewis Belcher, Johan Gudmundsson, Michael Green

Abstract: Ever since the Multilayered Perceptron was first introduced, the connectionist community has struggled with the concept of uncertainty and how this could be represented in these types of models. This past decade has seen a lot of effort in trying to join the principled approach of probabilistic modeling with the scalable nature of deep neural networks. While the theoretical benefits of this consolidation are clear, there are also several important practical aspects of these endeavors; namely to force the models we create to represent, learn, and report uncertainty in every prediction that is made. Many of these efforts have been based on extending existing frameworks with additional structures. We present Borch, a scalable deep universal probabilistic programming language, built on top of PyTorch. The code is available for download and use in our repository https://gitlab.com/desupervised/borch.

Submitted 13 September, 2022; originally announced September 2022.

arXiv:2207.10710

doi 10.1103/PhysRevC.107.014321

Interpretable Boosted Decision Tree Analysis for the Majorana Demonstrator

Authors: I. J. Arnquist, F. T. Avignone III, A. S. Barabash, C. J. Barton, K. H. Bhimani, E. Blalock, B. Bos, M. Busch, M. Buuck, T. S. Caldwell, Y. -D. Chan, C. D. Christofferson, P. -H. Chu, M. L. Clark, C. Cuesta, J. A. Detwiler, Yu. Efremenko, S. R. Elliott, G. K. Giovanetti, M. P. Green, J. Gruszko, I. S. Guinn, V. E. Guiseppe, C. R. Haufe, R. Henning, et al. (30 additional authors not shown)

Abstract: The Majorana Demonstrator is a leading experiment searching for neutrinoless double-beta decay with high purity germanium detectors (HPGe). Machine learning provides a new way to maximize the amount of information provided by these detectors, but the data-driven nature makes it less interpretable compared to traditional analysis. An interpretability study reveals the machine's decision-making logic, allowing us to learn from the machine and feed back into the traditional analysis. In this work, we have presented the first machine learning analysis of the data from the Majorana Demonstrator; this is also the first interpretable machine learning analysis of any germanium detector experiment. Two gradient boosted decision tree models are trained to learn from the data, and a game-theory-based model interpretability study is conducted to understand the origin of the classification power. By learning from data, this analysis recognizes the correlations among reconstruction parameters to further enhance the background rejection performance. By learning from the machine, this analysis reveals the importance of new background categories to reciprocally benefit the standard Majorana analysis. This model is highly compatible with next-generation germanium detector experiments like LEGEND since it can be simultaneously trained on a large number of detectors.

Submitted 15 February, 2023; v1 submitted 21 July, 2022; originally announced July 2022.

Comments: 13 pages, 9 figures

arXiv:2206.13623

Learning Controllable 3D Level Generators

Authors: Zehua Jiang, Sam Earle, Michael Cerny Green, Julian Togelius

Abstract: Procedural Content Generation via Reinforcement Learning (PCGRL) foregoes the need for large human-authored data-sets and allows agents to train explicitly on functional constraints, using computable, user-defined measures of quality instead of target output. We explore the application of PCGRL to 3D domains, in which content-generation tasks naturally have greater complexity and potential pertinence to real-world applications. Here, we introduce several PCGRL tasks for the 3D domain, Minecraft (Mojang Studios, 2009). These tasks will challenge RL-based generators using affordances often found in 3D environments, such as jumping, multiple dimensional movement, and gravity. We train an agent to optimize each of these tasks to explore the capabilities of previous research in PCGRL. This agent is able to generate relatively complex and diverse levels, and generalize to random initial states and control targets. Controllability tests in the presented tasks demonstrate their utility to analyze success and failure for 3D generators.

Submitted 14 August, 2022; v1 submitted 27 June, 2022; originally announced June 2022.

Comments: 8 pages, 9 figures

arXiv:2206.05497

Mutation Models: Learning to Generate Levels by Imitating Evolution

Authors: Ahmed Khalifa, Michael Cerny Green, Julian Togelius

Abstract: Search-based procedural content generation (PCG) is a well-known method for level generation in games. Its key advantage is that it is generic and able to satisfy functional constraints. However, due to the heavy computational costs to run these algorithms online, search-based PCG is rarely utilized for real-time generation. In this paper, we introduce mutation models, a new type of iterative level generator based on machine learning. We train a model to imitate the evolutionary process and use the trained model to generate levels. This trained model is able to modify noisy levels sequentially to create better levels without the need for a fitness function during inference. We evaluate our trained models on a 2D maze generation task. We compare several different versions of the method: training the models either at the end of evolution (normal evolution) or every 100 generations (assisted evolution) and using the model as a mutation function during evolution. Using the assisted evolution process, the final trained models are able to generate mazes with a success rate of 99% and high diversity of 86%. The trained model is many times faster than the evolutionary process it was trained on. This work opens the door to a new way of learning level generators guided by an evolutionary process, meaning automatic creation of generators with specifiable constraints and objectives that are fast enough for runtime deployment in games.

Submitted 25 August, 2022; v1 submitted 11 June, 2022; originally announced June 2022.

Comments: 8 pages, 6 figures, and 2 tables. Published at PCGWorkshop 2022 at FDG 2022

arXiv:2206.01326

Improving Fairness in Large-Scale Object Recognition by CrowdSourced Demographic Information

Authors: Zu Kim, André Araujo, Bingyi Cao, Cam Askew, Jack Sim, Mike Green, N'Mah Fodiatu Yilla, Tobias Weyand

Abstract: There has been increasing awareness of ethical issues in machine learning, and fairness has become an important research topic. Most fairness efforts in computer vision have been focused on human sensing applications and preventing discrimination by people's physical attributes such as race, skin color or age by increasing visual representation for particular demographic groups. We argue that ML fairness efforts should extend to object recognition as well. Buildings, artwork, food and clothing are examples of the objects that define human culture. Representing these objects fairly in machine learning datasets will lead to models that are less biased towards a particular culture and more inclusive of different traditions and values. There exist many research datasets for object recognition, but they have not carefully considered which classes should be included, or how much training data should be collected per class. To address this, we propose a simple and general approach, based on crowdsourcing the demographic composition of the contributors: we define fair relevance scores, estimate them, and assign them to each class. We showcase its application to the landmark recognition domain, presenting a detailed analysis and the final fairer landmark rankings. We present analysis which leads to a much fairer coverage of the world compared to existing datasets. The evaluation dataset was used for the 2021 Google Landmark Challenges, which was the first of a kind with an emphasis on fairness in generic object recognition.

Submitted 2 June, 2022; originally announced June 2022.

arXiv:2205.09073

Dialog Inpainting: Turning Documents into Dialogs

Authors: Zhuyun Dai, Arun Tejasvi Chaganty, Vincent Zhao, Aida Amini, Qazi Mamunur Rashid, Mike Green, Kelvin Guu

Abstract: Many important questions (e.g. "How to eat healthier?") require conversation to establish context and explore in depth. However, conversational question answering (ConvQA) systems have long been stymied by scarce training data that is expensive to collect. To address this problem, we propose a new technique for synthetically generating diverse and high-quality dialog data: dialog inpainting. Our approach takes the text of any document and transforms it into a two-person dialog between the writer and an imagined reader: we treat sentences from the article as utterances spoken by the writer, and then use a dialog inpainter to predict what the imagined reader asked or said in between each of the writer's utterances. By applying this approach to passages from Wikipedia and the web, we produce WikiDialog and WebDialog, two datasets totalling 19 million diverse information-seeking dialogs -- 1,000x larger than the largest existing ConvQA dataset. Furthermore, human raters judge the answer adequacy and conversationality of WikiDialog to be as good or better than existing manually-collected datasets. Using our inpainted data to pre-train ConvQA retrieval systems, we significantly advance state-of-the-art across three benchmarks (QReCC, OR-QuAC, TREC CAsT) yielding up to 40% relative gains on standard evaluation metrics.

Submitted 31 May, 2022; v1 submitted 18 May, 2022; originally announced May 2022.

arXiv:2204.05217

Persona-driven Dominant/Submissive Map (PDSM) Generation for Tutorials

Authors: Michael Cerny Green, Ahmed Khalifa, M Charity, Julian Togelius

Abstract: In this paper, we present a method for automated persona-driven video game tutorial level generation. Tutorial levels are scenarios in which the player can explore and discover different rules and game mechanics. Procedural personas can guide generators to create content which encourages or discourages certain playstyle behaviors. In this system, we use procedural personas to calculate the behavioral characteristics of levels which are evolved using the quality-diversity algorithm known as Constrained MAP-Elites. An evolved map's quality is determined by its simplicity: the simpler it is, the better it is. Within this work, we show that the generated maps can strongly encourage or discourage different persona-like behaviors and range from simple solutions to complex puzzle-levels, making them perfect candidates for a tutorial generative system.

Submitted 11 April, 2022; originally announced April 2022.

Comments: 10 pages, 7 figures, 2 tables

arXiv:2203.13351

Predicting Personas Using Mechanic Frequencies and Game State Traces

Authors: Michael Cerny Green, Ahmed Khalifa, M Charity, Debosmita Bhaumik, Julian Togelius

Abstract: We investigate how to efficiently predict play personas based on playtraces. Play personas can be computed by calculating the action agreement ratio between a player and a generative model of playing behavior, a so-called procedural persona. But this is computationally expensive and assumes that appropriate procedural personas are readily available. We present two methods for estimating player persona, one using regular supervised learning and aggregate measures of game mechanics initiated, and another based on sequence learning on a trace of closely cropped gameplay observations. While both of these methods achieve high accuracy when predicting play personas defined by agreement with procedural personas, they utterly fail to predict play style as defined by the players themselves using a questionnaire. This interesting result highlights the value of using computational methods in defining play personas.

Submitted 15 June, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

Comments: 8 pages, 3 tables, 2 figures

arXiv:2201.05177

Making a (Counterfactual) Difference One Rationale at a Time

Authors: Mitchell Plyler, Michael Green, Min Chi

Abstract: Rationales, snippets of extracted text that explain an inference, have emerged as a popular framework for interpretable natural language processing (NLP). Rationale models typically consist of two cooperating modules: a selector and a classifier with the goal of maximizing the mutual information (MMI) between the "selected" text and the document label. Despite their promises, MMI-based methods often pick up on spurious text patterns and result in models with nonsensical behaviors. In this work, we investigate whether counterfactual data augmentation (CDA), without human assistance, can improve the performance of the selector by lowering the mutual information between spurious signals and the document label. Our counterfactuals are produced in an unsupervised fashion using class-dependent generative models. From an information theoretic lens, we derive properties of the unaugmented dataset for which our CDA approach would succeed. The effectiveness of CDA is empirically evaluated by comparing against several baselines including an improved MMI-based rationale schema on two multi-aspect datasets. Our results show that CDA produces rationales that better capture the signal of interest.

Submitted 13 January, 2022; originally announced January 2022.

Journal ref: Advances in Neural Information Processing Systems 2021

arXiv:2109.11007

SoK: Cryptographic Confidentiality of Data on Mobile Devices

Authors: Maximilian Zinkus, Tushar M. Jois, Matthew Green

Abstract: Mobile devices have become an indispensable component of modern life. Their high storage capacity gives these devices the capability to store vast amounts of sensitive personal data, which makes them a high-value target: these devices are routinely stolen by criminals for data theft, and are increasingly viewed by law enforcement agencies as a valuable source of forensic data. Over the past several years, providers have deployed a number of advanced cryptographic features intended to protect data on mobile devices, even in the strong setting where an attacker has physical access to a device. Many of these techniques draw from the research literature, but have been adapted to this entirely new problem setting. This involves a number of novel challenges, which are incompletely addressed in the literature. In this work, we outline those challenges, and systematize the known approaches to securing user data against extraction attacks. Our work proposes a methodology that researchers can use to analyze cryptographic data confidentiality for mobile devices. We evaluate the existing literature for securing devices against data extraction adversaries with powerful capabilities including access to devices and to the cloud services they rely on. We then analyze existing mobile device confidentiality measures to identify research areas that have not received proper attention from the community and represent opportunities for future research.

Submitted 22 September, 2021; originally announced September 2021.

Comments: Proceedings on Privacy Enhancing Technologies Symposium

arXiv:2108.08874

Towards A Fairer Landmark Recognition Dataset

Authors: Zu Kim, André Araujo, Bingyi Cao, Cam Askew, Jack Sim, Mike Green, N'Mah Fodiatu Yilla, Tobias Weyand

Abstract: We introduce a new landmark recognition dataset, which is created with a focus on fair worldwide representation. While previous work proposes to collect as many images as possible from web repositories, we instead argue that such approaches can lead to biased data. To create a more comprehensive and equitable dataset, we start by defining the fair relevance of a landmark to the world population. These relevances are estimated by combining anonymized Google Maps user contribution statistics with the contributors' demographic information. We present a stratification approach and analysis which leads to a much fairer coverage of the world, compared to existing datasets. The resulting datasets are used to evaluate computer vision models as part of the Google Landmark Recognition and Retrieval Challenges 2021.

Submitted 6 June, 2022; v1 submitted 19 August, 2021; originally announced August 2021.

Comments: Please cite the full detailed version of the paper instead: Improving Fairness in Large-Scale Object Recognition by CrowdSourced Demographic Information arXiv:2206.01326

arXiv:2108.02955 [pdf, other]

Impressions of the GDMC AI Settlement Generation Challenge in Minecraft

Authors: Christoph Salge, Claus Aranha, Adrian Brightmoore, Sean Butler, Rodrigo Canaan, Michael Cook, Michael Cerny Green, Hagen Fischer, Christian Guckelsberger, Jupiter Hadley, Jean-Baptiste Hervé, Mark R Johnson, Quinn Kybartas, David Mason, Mike Preuss, Tristan Smith, Ruck Thawonmas, Julian Togelius

Abstract: The GDMC AI settlement generation challenge is a PCG competition about producing an algorithm that can create an "interesting" Minecraft settlement for a given map. This paper contains a collection of written experiences with this competition, by participants, judges, organizers and advisors. We asked people to reflect both on the artifacts themselves, and on the competition in general. The aim of this paper is to offer a shareable and edited collection of experiences and qualitative feedback - which seem to contain a lot of insights on PCG and computational creativity, but would otherwise be lost once the output of the competition is reduced to scalar performance values. We reflect upon some organizational issues for AI competitions, and discuss the future of the GDMC competition.

Submitted 6 August, 2021; originally announced August 2021.

Comments: 28 pages, 5 figures

arXiv:2105.12613 [pdf, other]

Data Security on Mobile Devices: Current State of the Art, Open Problems, and Proposed Solutions

Authors: Maximilian Zinkus, Tushar M. Jois, Matthew Green

Abstract: In this work we present definitive evidence, analysis, and (where needed) speculation to answer the questions, (1) Which concrete security measures in mobile devices meaningfully prevent unauthorized access to user data? (2) In what ways are modern mobile devices accessed by unauthorized parties? (3) How can we improve modern mobile devices to prevent unauthorized access? We examine the two major platforms in the mobile space, iOS and Android, and for each we provide a thorough investigation of existing and historical security features, evidence-based discussion of known security bypass techniques, and concrete recommendations for remediation. We then aggregate and analyze public records, documentation, articles, and blog postings to categorize and discuss unauthorized bypass of security features by hackers and law enforcement alike. We provide in-depth analysis of the data potentially accessed via law enforcement methodologies from both mobile devices and associated cloud services. Our fact-gathering and analysis allow us to make a number of recommendations for improving data security on these devices. The mitigations we propose can be largely summarized as increasing coverage of sensitive data via strong encryption, but we detail various challenges and approaches towards this goal and others.
It is our hope that this work stimulates mobile device development and research towards security and privacy, provides a unique reference of information, and acts as an evidence-based argument for the importance of reliable encryption to privacy, which we believe is both a human right and integral to a functioning democracy.

Submitted 26 May, 2021; originally announced May 2021.

Comments: Please see https://securephones.io/ for the project's website

arXiv:2105.08550 [pdf, other]

Federated Learning With Highly Imbalanced Audio Data

Authors: Marc C. Green, Mark D. Plumbley

Abstract: Federated learning (FL) is a privacy-preserving machine learning method that has been proposed to allow training of models using data from many different clients, without these clients having to transfer all their data to a central server. There has as yet been relatively little consideration of FL or other privacy-preserving methods in audio. In this paper, we investigate using FL for a sound event detection task using audio from the FSD50K dataset. Audio is split into clients based on uploader metadata. This results in highly imbalanced subsets of data between clients, noted as a key issue in FL scenarios. A series of models is trained using 'high-volume' clients that contribute 100 audio clips or more, testing the effects of varying FL parameters, followed by an additional model trained using all clients with no minimum audio contribution. It is shown that FL models trained using the high-volume clients can perform similarly to a centrally-trained model, though there is much more noise in results than would typically be expected for a centrally-trained model. The FL model trained using all clients has a considerably reduced performance compared to the centrally-trained model.

Submitted 18 May, 2021; originally announced May 2021.
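
The client imbalance mentioned in the abstract above can be illustrated with a minimal sketch of sample-weighted parameter averaging, the aggregation step commonly associated with FedAvg. The abstract does not say which aggregation rule the authors used, so this is an assumption for illustration only; the function name `weighted_average` and the example numbers are hypothetical.

```python
from typing import List, Sequence


def weighted_average(
    client_params: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average client parameter vectors, weighting each by its sample count.

    Hypothetical helper: illustrates FedAvg-style aggregation, not the
    paper's actual training code.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_params[0])
    averaged = [0.0] * dim
    for params, size in zip(client_params, client_sizes):
        if len(params) != dim:
            raise ValueError("all clients must share one parameter shape")
        weight = size / total  # large clients dominate the global model
        for i, value in enumerate(params):
            averaged[i] += weight * value
    return averaged


# Two clients with very different amounts of data (made-up numbers).
global_params = weighted_average([[1.0, 0.0], [0.0, 1.0]], [300, 100])
print(global_params)
```

With weights proportional to sample counts, a client holding three times as much data pulls the global parameters three times as hard, which is one reason heavily skewed client sizes are a recognized difficulty in FL.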

arXiv:2105.07898 [pdf, other]

Physics-informed attention-based neural network for solving non-linear partial differential equations

Authors: Ruben Rodriguez-Torrado, Pablo Ruiz, Luis Cueto-Felgueroso, Michael Cerny Green, Tyler Friesen, Sebastien Matringe, Julian Togelius

Abstract: Physics-Informed Neural Networks (PINNs) have enabled significant improvements in modelling physical processes described by partial differential equations (PDEs). PINNs are based on simple architectures, and learn the behavior of complex physical systems by optimizing the network parameters to minimize the residual of the underlying PDE. Current network architectures share some of the limitations of classical numerical discretization schemes when applied to non-linear differential equations in continuum mechanics. A paradigmatic example is the solution of hyperbolic conservation laws that develop highly localized nonlinear shock waves. Learning solutions of PDEs with dominant hyperbolic character is a challenge for current PINN approaches, which rely, like most grid-based numerical schemes, on adding artificial dissipation. Here, we address the fundamental question of which network architectures are best suited to learn the complex behavior of non-linear PDEs. We focus on network architecture rather than on residual regularization. Our new methodology, called Physics-Informed Attention-based Neural Networks (PIANNs), is a combination of recurrent neural networks and attention mechanisms. The attention mechanism adapts the behavior of the deep neural network to the non-linear features of the solution, and breaks the current limitations of PINNs. We find that PIANNs effectively capture the shock front in a hyperbolic model problem, and are capable of providing high-quality solutions inside and beyond the training set.

Submitted 17 May, 2021; originally announced May 2021.

arXiv:2105.04342 [pdf, other]

Exploring open-ended gameplay features with Micro RollerCoaster Tycoon

Authors: Michael Cerny Green, Victoria Yen, Sam Earle, Dipika Rajesh, Maria Edwards, L. B. Soros

Abstract: This paper introduces MicroRCT, a novel open source simulator inspired by the theme park sandbox game RollerCoaster Tycoon. The goal in MicroRCT is to place rides and shops in an amusement park to maximize profit earned from park guests. Thus, the challenges for game AI include both selecting high-earning attractions and placing them in locations that are convenient to guests. In this paper, the MAP-Elites algorithm is used to generate a diversity of park layouts, exploring two theoretical questions about evolutionary algorithms and game design: 1) Is there a benefit to starting from a minimal starting point for evolution and complexifying incrementally? and 2) What are the effects of resource limitations on creativity and optimization? Results indicate that building from scratch with no costs results in the widest diversity of high-performing designs.

Submitted 10 May, 2021; originally announced May 2021.

Comments: 8 pages, 10 figures, submitted to Foundations of Digital Games Conference 2021
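
For readers unfamiliar with MAP-Elites, named in the abstract above, the following is a minimal sketch of the basic algorithm on a toy problem. It does not model MicroRCT: the two-number genome, the fitness function, the 5 x 5 descriptor grid, and every function name are hypothetical choices made only to show how the archive keeps one best solution ("elite") per behaviour cell.

```python
import random
from typing import Callable, Dict, List, Tuple

Genome = List[float]
Cell = Tuple[int, int]
Archive = Dict[Cell, Tuple[float, Genome]]

GRID = 5  # hypothetical: behaviour space is discretised into a 5 x 5 grid


def toy_fitness(genome: Genome) -> float:
    """Toy objective (assumption): higher is better, peak at (0.5, 0.5)."""
    return -sum((x - 0.5) ** 2 for x in genome)


def toy_descriptor(genome: Genome) -> Cell:
    """Map a genome in [0, 1]^2 to the grid cell it falls into."""
    return (
        min(int(genome[0] * GRID), GRID - 1),
        min(int(genome[1] * GRID), GRID - 1),
    )


def toy_mutate(genome: Genome, rng: random.Random) -> Genome:
    """Add Gaussian noise to each gene and clamp back into [0, 1]."""
    return [min(1.0, max(0.0, x + rng.gauss(0.0, 0.1))) for x in genome]


def map_elites(
    fitness: Callable[[Genome], float],
    descriptor: Callable[[Genome], Cell],
    mutate: Callable[[Genome, random.Random], Genome],
    iterations: int,
    n_initial: int,
    rng: random.Random,
) -> Archive:
    """Return an archive holding the best genome found for each cell."""
    archive: Archive = {}
    for step in range(iterations):
        if step < n_initial or not archive:
            # Bootstrap phase: sample random genomes.
            candidate = [rng.random(), rng.random()]
        else:
            # Main loop: pick a random elite and mutate it.
            _, parent = rng.choice(list(archive.values()))
            candidate = mutate(parent, rng)
        cell = descriptor(candidate)
        score = fitness(candidate)
        # Keep the candidate only if its cell is empty or it beats the elite.
        if cell not in archive or score > archive[cell][0]:
            archive[cell] = (score, candidate)
    return archive


archive = map_elites(
    toy_fitness, toy_descriptor, toy_mutate,
    iterations=2000, n_initial=50, rng=random.Random(0),
)
print(f"{len(archive)} of {GRID * GRID} cells filled")
```

The archive, rather than a single best individual, is the output: each filled cell holds a distinct kind of solution, which is what makes the algorithm suited to generating a diversity of layouts.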

arXiv:2104.12516 [pdf]

Evaluating the performance of personal, social, health-related, biomarker and genetic data for predicting an individuals future health using machine learning: A longitudinal analysis

Authors: Mark Green

Abstract: As we gain access to a greater depth and range of health-related information about individuals, three questions arise: (1) Can we build better models to predict individual-level risk of ill health? (2) How much data do we need to effectively predict ill health? (3) Are new methods required to process the added complexity that new forms of data bring? The aim of the study is to apply a machine learning approach to identify the relative contribution of personal, social, health-related, biomarker and genetic data as predictors of future health in individuals. Using longitudinal data from 6830 individuals in the UK from Understanding Society (2010-12 to 2015-17), the study compares the predictive performance of five types of measures: personal (e.g. age, sex), social (e.g. occupation, education), health-related (e.g. body weight, grip strength), biomarker (e.g. cholesterol, hormones) and genetic single nucleotide polymorphisms (SNPs). The predicted outcome variable was limiting long-term illness one and five years from baseline. Two machine learning approaches were used to build predictive models: deep learning via neural networks and XGBoost (gradient boosting decision trees). Model fit was compared to traditional logistic regression models. Results found that health-related measures had the strongest prediction of future health status, with genetic data performing poorly. Machine learning models only offered marginal improvements in model accuracy when compared to logistic regression models, but also performed well on other metrics e.g. neural networks were best on AUC and XGBoost on precision. The study suggests that increasing complexity of data and methods does not necessarily translate to improved understanding of the determinants of health or performance of predictive models of ill health.

Submitted 26 April, 2021; originally announced April 2021.

Comments: 19 pages

arXiv:2103.14950 [pdf, other]

The AI Settlement Generation Challenge in Minecraft: First Year Report

Authors: Christoph Salge, Michael Cerny Green, Rodrigo Canaan, Filip Skwarski, Rafael Fritsch, Adrian Brightmoore, Shaofang Ye, Changxing Cao, Julian Togelius

Abstract: This article outlines what we learned from the first year of the AI Settlement Generation Competition in Minecraft, a competition about producing AI programs that can generate interesting settlements in Minecraft for an unseen map. This challenge seeks to focus research into adaptive and holistic procedural content generation. Generating Minecraft towns and villages given existing maps is a suitable task for this, as it requires the generated content to be adaptive, functional, evocative and aesthetic at the same time. Here, we present the results from the first iteration of the competition. We discuss the evaluation methodology, present the different technical approaches by the competitors, and outline the open problems.

Submitted 27 March, 2021; originally announced March 2021.

Comments: 14 pages, 9 figures, published in KI-Künstliche Intelligenz

Journal ref: KI-Künstliche Intelligenz 2020

arXiv:2102.10247 [pdf, other]

Game Mechanic Alignment Theory and Discovery

Authors: Michael Cerny Green, Ahmed Khalifa, Philip Bontrager, Rodrigo Canaan, Julian Togelius

Abstract: We present a new concept called Game Mechanic Alignment theory as a way to organize game mechanics through the lens of systemic rewards and agential motivations. By disentangling player and systemic influences, mechanics may be better identified for use in an automated tutorial generation system, which could tailor tutorials for a particular playstyle or player. Within, we apply this theory to several well-known games to demonstrate how designers can benefit from it, we describe a methodology for how to estimate "mechanic alignment", and we apply this methodology on multiple games in the GVGAI framework. We discuss how effectively this estimation captures agential motivations and systemic rewards and how our theory could be used as an alternative way to find mechanics for tutorial generation.

Submitted 10 August, 2021; v1 submitted 19 February, 2021; originally announced February 2021.

Comments: 11 pages, 8 figures

arXiv:2101.07887 [pdf, other]

doi 10.1103/PhysRevLett.126.220503

Efficient, stabilized two-qubit gates on a trapped-ion quantum computer

Authors: Reinhold Blümel, Nikodem Grzesiak, Nhung H. Nguyen, Alaina M. Green, Ming Li, Andrii Maksymov, Norbert M. Linke, Yunseong Nam

Abstract: Quantum computing is currently limited by the cost of two-qubit entangling operations. In order to scale up quantum processors and achieve a quantum advantage, it is crucial to economize on the power requirement of two-qubit gates, make them robust to drift in experimental parameters, and shorten the gate times. In this paper, we present two methods, one exact and one approximate, to construct optimal pulses for entangling gates on a pair of ions within a trapped ion chain, one of the leading quantum computing architectures. Our methods are direct, non-iterative, and linear, and can construct gate-steering pulses requiring less power than the standard method by more than an order of magnitude in some parameter regimes. The power savings may generally be traded for reduced gate time and greater qubit connectivity. Additionally, our methods provide increased robustness to mode drift. We illustrate these trade-offs on a trapped-ion quantum computer.

Submitted 19 January, 2021; originally announced January 2021.

Journal ref: Phys. Rev. Lett. 126, 220503 (2021)

arXiv:2009.03977 [pdf, other]

Modeling Wildfire Perimeter Evolution using Deep Neural Networks

Authors: Maxfield E. Green, Karl Kaiser, Nat Shenton

Abstract: With the increased size and frequency of wildfire events worldwide, accurate real-time prediction of evolving wildfire fronts is a crucial component of firefighting efforts and forest management practices. We propose a wildfire spreading model that predicts the evolution of the wildfire perimeter in 24 hour periods. The fire spreading simulation is based on a deep convolutional neural network (CNN) that is trained on remotely sensed atmospheric and environmental time series data. We show that the model is able to learn wildfire spreading dynamics from real historic data sets from a series of wildfires in the Western Sierra Nevada Mountains in California. We validate the model on a previously unseen wildfire and produce realistic results that significantly outperform historic alternatives with validation accuracies ranging from 78% - 98%.

Submitted 8 September, 2020; originally announced September 2020.

arXiv:2007.04611 [pdf, other]

A deep learning approach to identify unhealthy advertisements in street view images

Authors: Gregory Palmer, Mark Green, Emma Boyland, Yales Stefano Rios Vasconcelos, Rahul Savani, Alex Singleton

Abstract: While outdoor advertisements are common features within towns and cities, they may reinforce social inequalities in health. Vulnerable populations in deprived areas may have greater exposure to fast food, gambling and alcohol advertisements encouraging their consumption. Understanding who is exposed and evaluating potential policy restrictions requires a substantial manual data collection effort. To address this problem we develop a deep learning workflow to automatically extract and classify unhealthy advertisements from street-level images. We introduce the Liverpool 360 Street View (LIV360SV) dataset for evaluating our workflow. The dataset contains 25,349, 360 degree, street-level images collected via cycling with a GoPro Fusion camera, recorded Jan 14th - 18th 2020. 10,106 advertisements were identified and classified as food (1335), alcohol (217), gambling (149) and other (8405) (e.g., cars and broadband). We find evidence of social inequalities with a larger proportion of food advertisements located within deprived areas and those frequented by students. Our project presents a novel implementation for the incidental classification of street view images for identifying unhealthy advertisements, providing a means through which to identify areas that can benefit from tougher advertisement restriction policies for tackling social inequalities.

Submitted 7 February, 2021; v1 submitted 9 July, 2020; originally announced July 2020.

Comments: 13 pages, 5 figures, 3 table. To appear in Nature Scientific Reports

arXiv:2002.04733 [pdf, other]

Mech-Elites: Illuminating the Mechanic Space of GVGAI

Authors: M Charity, Michael Cerny Green, Ahmed Khalifa, Julian Togelius

Abstract: This paper introduces a fully automatic method of mechanic illumination for general video game level generation. Using the Constrained MAP-Elites algorithm and the GVG-AI framework, this system generates the simplest tile based levels that contain specific sets of game mechanics and also satisfy playability constraints. We apply this method to illuminate mechanic space for $4$ different games in GVG-AI: Zelda, Solarfox, Plants, and RealPortals.

Submitted 24 August, 2022; v1 submitted 11 February, 2020; originally announced February 2020.

arXiv:2002.02992 [pdf, other]

Mario Level Generation From Mechanics Using Scene Stitching

Authors: Michael Cerny Green, Luvneesh Mugrai, Ahmed Khalifa, Julian Togelius

Abstract: This paper presents a level generation method for Super Mario by stitching together pre-generated "scenes" that contain specific mechanics, using mechanic-sequences from agent playthroughs as input specifications. Given a sequence of mechanics, our system uses an FI-2Pop algorithm and a corpus of scenes to perform automated level authoring. The system outputs levels that have a similar mechanical sequence to the target mechanic sequence but with a different playthrough experience. We compare our system to a greedy method that selects scenes that maximize the target mechanics. Our system is able to maximize the number of matched mechanics while reducing emergent mechanics using the stitching process compared to the greedy approach.

Submitted 7 February, 2020; originally announced February 2020.

Comments: 10 pages, 7 figures, submitted to Foundations of Digital Games Conference

arXiv:1910.01603 [pdf, other]

Bootstrapping Conditional GANs for Video Game Level Generation

Authors: Ruben Rodriguez Torrado, Ahmed Khalifa, Michael Cerny Green, Niels Justesen, Sebastian Risi, Julian Togelius

Abstract: Generative Adversarial Networks (GANs) have shown impressive results for image generation. However, GANs face challenges in generating contents with certain types of constraints, such as game levels. Specifically, it is difficult to generate levels that have aesthetic appeal and are playable at the same time. Additionally, because training data usually is limited, it is challenging to generate unique levels with current GANs. In this paper, we propose a new GAN architecture named Conditional Embedding Self-Attention Generative Adversarial Network (CESAGAN) and a new bootstrapping training procedure. The CESAGAN is a modification of the self-attention GAN that incorporates an embedding feature vector input to condition the training of the discriminator and generator. This allows the network to model non-local dependency between game objects, and to count objects. Additionally, to reduce the number of levels necessary to train the GAN, we propose a bootstrapping mechanism in which playable generated levels are added to the training set. The results demonstrate that the new approach does not only generate a larger number of levels that are playable but also generates fewer duplicate levels compared to a standard GAN.

Submitted 3 October, 2019; originally announced October 2019.

arXiv:1909.03094 [pdf, other]

Automatic Critical Mechanic Discovery Using Playtraces in Video Games

Authors: Michael Cerny Green, Ahmed Khalifa, Gabriella A. B. Barros, Tiago Machado, Julian Togelius

Abstract: We present a new method of automatic critical mechanic discovery for video games using a combination of game description parsing and playtrace information. This method is applied to several games within the General Video Game Artificial Intelligence (GVG-AI) framework. In a user study, human-identified mechanics are compared against system-identified critical mechanics to verify alignment between humans and the system. The results of the study demonstrate that the new method is able to match humans with higher consistency than baseline. Our system is further validated by comparing MCTS agents augmented with critical mechanics and vanilla MCTS agents on $4$ games from GVG-AI. Our new playtrace method shows a significant performance improvement over the baseline for all 4 tested games. The proposed method also shows either matched or improved performance over the old method, demonstrating that playtrace information is responsible for more complete critical mechanic discovery.

Submitted 15 September, 2020; v1 submitted 6 September, 2019; originally announced September 2019.

Comments: 15 pages, 4 figures, 2 tables, 1 algorithm, 1 equation

arXiv:1906.05160 [pdf, other]

General Video Game Rule Generation

Authors: Ahmed Khalifa, Michael Cerny Green, Diego Perez-Liebana, Julian Togelius

Abstract: We introduce the General Video Game Rule Generation problem, and the eponymous software framework which will be used in a new track of the General Video Game AI (GVGAI) competition. The problem is, given a game level as input, to generate the rules of a game that fits that level. This can be seen as the inverse of the General Video Game Level Generation problem. Conceptualizing these two problems as separate helps break the very hard problem of generating complete games into smaller, more manageable subproblems. The proposed framework builds on the GVGAI software and thus asks the rule generator for rules defined in the Video Game Description Language. We describe the API, and three different rule generators: a random, a constructive and a search-based generator. Early results indicate that the constructive generator generates playable and somewhat interesting game rules but has a limited expressive range, whereas the search-based generator generates remarkably diverse rulesets, but with an uneven quality.

Submitted 12 June, 2019; originally announced June 2019.

Comments: 8 pages, 9 listings, 1 table, 2 figures

arXiv:1906.05094 [pdf, other]

Organic Building Generation in Minecraft

Authors: Michael Cerny Green, Christoph Salge, Julian Togelius

Abstract: This paper presents a method for generating floor plans for structures in Minecraft (Mojang 2009). Given a 3D space, it will auto-generate a building to fill that space using a combination of constrained growth and cellular automata. The result is a series of organic-looking buildings complete with rooms, windows, and doors connecting them. The method is applied to the Generative Design in Minecraft (GDMC) competition to auto-generate buildings in Minecraft, and the results are discussed.

Submitted 11 June, 2019; originally announced June 2019.

Comments: 7 pages, 9 figures, published at PCG workshop at the Foundations of Digital Games Conference 2019

arXiv:1906.04660 [pdf, other]

Two-step Constructive Approaches for Dungeon Generation

Authors: Michael Cerny Green, Ahmed Khalifa, Athoug Alsoughayer, Divyesh Surana, Antonios Liapis, Julian Togelius

Abstract: This paper presents a two-step generative approach for creating dungeons in the rogue-like puzzle game MiniDungeons 2. Generation is split into two steps, initially producing the architectural layout of the level as its walls and floor tiles, and then furnishing it with game objects representing the player's start and goal position, challenges and rewards. Three layout creators and three furnishers are introduced in this paper, which can be combined in different ways in the two-step generative process for producing diverse dungeon levels. Layout creators generate the floors and walls of a level, while furnishers populate it with monsters, traps, and treasures. We test the generated levels on several expressivity measures, and in simulations with procedural persona agents.

Submitted 11 June, 2019; originally announced June 2019.

Comments: 7 pages, 4 figures, published at PCG workshop at the Foundations of Digital Games Conference 2019

arXiv:1905.05888 [pdf, other]

Generative Design in Minecraft: Chronicle Challenge

Authors: Christoph Salge, Christian Guckelsberger, Michael Cerny Green, Rodrigo Canaan, Julian Togelius

Abstract: We introduce the Chronicle Challenge as an optional addition to the Settlement Generation Challenge in Minecraft. One of the foci of the overall competition is adaptive procedural content generation (PCG), an arguably under-explored problem in computational creativity. In the base challenge, participants must generate new settlements that respond to and ideally interact with existing content in the world, such as the landscape or climate. The goal is to understand the underlying creative process, and to design better PCG systems. The Chronicle Challenge in particular focuses on the generation of a narrative based on the history of a generated settlement, expressed in natural language. We discuss the unique features of the Chronicle Challenge in comparison to other competitions, clarify the characteristics of a chronicle eligible for submission and describe the evaluation criteria. We furthermore draw on simulation-based approaches in computational storytelling as examples of how this challenge could be approached.

Submitted 14 May, 2019; originally announced May 2019.

Comments: 5 pages, 1 Figure, accepted as late-breaking paper at ICCC 2019, 10th International Conference on Computational Creativity

arXiv:1904.08972 [pdf, other]

Intentional Computational Level Design

Authors: Ahmed Khalifa, Michael Cerny Green, Gabriella Barros, Julian Togelius

Abstract: The procedural generation of levels and content in video games is a challenging AI problem. Often such generation relies on an intelligent way of evaluating the content being generated so that constraints are satisfied and/or objectives maximized. In this work, we address the problem of creating levels that are not only playable but also revolve around specific mechanics in the game. We use constrained evolutionary algorithms and quality-diversity algorithms to generate small sections of Super Mario Bros levels called scenes, using three different simulation approaches: Limited Agents, Punishing Model, and Mechanics Dimensions. All three approaches are able to create scenes that give opportunity for a player to encounter or use targeted mechanics with different properties. We conclude by discussing the advantages and disadvantages of each approach and compare them to each other.

Submitted 18 April, 2019; originally announced April 2019.

Comments: 8 pages, 10 figures, 3 tables, GECCO 2019

arXiv:1904.06425 [pdf, other]

KeyForge: Mitigating Email Breaches with Forward-Forgeable Signatures

Authors: Michael Specter, Sunoo Park, Matthew Green

Abstract: Email breaches are commonplace, and they expose a wealth of personal, business, and political data that may have devastating consequences. The current email system allows any attacker who gains access to your email to prove the authenticity of the stolen messages to third parties -- a property arising from a necessary anti-spam / anti-spoofing protocol called DKIM. This exacerbates the problem of email breaches by greatly increasing the potential for attackers to damage the users' reputation, blackmail them, or sell the stolen information to third parties. In this paper, we introduce "non-attributable email", which guarantees that a wide class of adversaries are unable to convince any third party of the authenticity of stolen emails. We formally define non-attributability, and present two practical system proposals -- KeyForge and TimeForge -- that provably achieve non-attributability while maintaining the important protection against spam and spoofing that is currently provided by DKIM. Moreover, we implement KeyForge and demonstrate that that scheme is practical, achieving competitive verification and signing speed while also requiring 42% less bandwidth per email than RSA2048.

Submitted 12 April, 2019; originally announced April 2019.

arXiv:1903.11678 [pdf, other]

Tree Search vs Optimization Approaches for Map Generation

Authors: Debosmita Bhaumik, Ahmed Khalifa, Michael Cerny Green, Julian Togelius

Abstract: Search-based procedural content generation uses stochastic global optimization algorithms to search for game content. However, standard tree search algorithms can be competitive with evolution on some optimization problems. We investigate the applicability of several tree search methods to level generation and compare them systematically with several optimization algorithms, including evolutionary algorithms. We compare them on three different game level generation problems: Binary, Zelda, and Sokoban. We introduce two new representations that can help tree search algorithms deal with the large branching factor of the generation problem. We find that in general, optimization algorithms clearly outperform tree search algorithms, but given the right problem representation certain tree search algorithms perform similarly to optimization algorithms, and in one particular problem, we see surprisingly strong results from MCTS.

Submitted 12 August, 2020; v1 submitted 27 March, 2019; originally announced March 2019.

Comments: 10 pages, 9 figures, published at AIIDE 2020

arXiv:1901.05431 [pdf, other]

Evolutionarily-Curated Curriculum Learning for Deep Reinforcement Learning Agents

Authors: Michael Cerny Green, Benjamin Sergent, Pushyami Shandilya, Vibhor Kumar

Abstract: In this paper we propose a new training loop for deep reinforcement learning agents with an evolutionary generator. Evolutionary procedural content generation has been used in the creation of maps and levels for games before. Our system incorporates an evolutionary map generator to construct a training curriculum that is evolved to maximize loss within the state-of-the-art Double Dueling Deep Q Network architecture with prioritized replay. We present a case-study in which we prove the efficacy of our new method on a game with a discrete, large action space we made called Attackers and Defenders. Our results demonstrate that training on an evolutionarily-curated curriculum (directed sampling) of maps both expedites training and improves generalization when compared to a network trained on an undirected sampling of maps.

Submitted 16 January, 2019; originally announced January 2019.

Comments: 9 pages, 7 figures, accepted to the Reinforcement Learning in Games workshop at AAAI 2019

arXiv:1810.02251 [pdf, other]

doi 10.1145/3235765.3235792

DATA Agent

Authors: Michael Cerny Green, Gabriella A. B. Barros, Antonios Liapis, Julian Togelius

Abstract: This paper introduces DATA Agent, a system which creates murder mystery adventures from open data. In the game, the player takes on the role of a detective tasked with finding the culprit of a murder. All characters, places, and items in DATA Agent games are generated using open data as source content. The paper discusses the general game design and user interface of DATA Agent, and provides details on the generative algorithms which transform linked data into different game objects. Findings from a user study with 30 participants playing through two games of DATA Agent show that the game is easy and fun to play, and that the mysteries it generates are straightforward to solve.

Submitted 28 September, 2018; originally announced October 2018.

Comments: 8 pages, 4 images, 3 tables

Journal ref: Foundations of Digital Games (FDG) 2018

arXiv:1808.08274 [pdf, other]

Can we leverage rating patterns from traditional users to enhance recommendations for children?

Authors: Ion Madrazo Azpiazu, Michael Green, Oghenemaro Anuyah, Maria Soledad Pera

Abstract: Recommender algorithm performance is often associated with the availability of sufficient historical rating data. Unfortunately, when it comes to children, this data is seldom available. In this paper, we report on an initial analysis conducted to examine the degree to which data about traditional users, i.e., adults, can be leveraged to enhance the recommendation process for children.

Submitted 24 August, 2018; originally announced August 2018.

Comments: ACM RecSys 2018

arXiv:1807.06734 [pdf, other]

doi 10.1145/3235765.3235820

Generating Levels That Teach Mechanics

Authors: Michael Cerny Green, Ahmed Khalifa, Gabriella A. B. Barros, Andy Nealen, Julian Togelius

Abstract: The automatic generation of game tutorials is a challenging AI problem. While it is possible to generate annotations and instructions that explain to the player how the game is played, this paper focuses on generating a gameplay experience that introduces the player to a game mechanic. It evolves small levels for the Mario AI Framework that can only be beaten by an agent that knows how to perform specific actions in the game. It uses variations of a perfect A* agent that are limited in various ways, such as not being able to jump high or see enemies, to test how failing to do certain actions can stop the player from beating the level.

Submitted 1 October, 2018; v1 submitted 17 July, 2018; originally announced July 2018.

Comments: 8 pages, 7 figures, PCG Workshop at FDG 2018, 9th International Workshop on Procedural Content Generation (PCG2018)

arXiv:1807.04375 [pdf, other]

doi 10.1145/3235765.3235790

AtDelfi: Automatically Designing Legible, Full Instructions For Games

Authors: Michael Cerny Green, Ahmed Khalifa, Gabriella A. B. Barros, Tiago Machado, Andy Nealen, Julian Togelius

Abstract: This paper introduces a fully automatic method for generating video game tutorials. The AtDELFI system (AuTomatically DEsigning Legible, Full Instructions for games) was created to investigate procedural generation of instructions that teach players how to play video games. We present a representation of game rules and mechanics using a graph system as well as a tutorial generation method that uses said graph representation. We demonstrate the concept by testing it on games within the General Video Game Artificial Intelligence (GVG-AI) framework; the paper discusses tutorials generated for eight different games. Our findings suggest that a graph representation scheme works well for simple arcade style games such as Space Invaders and Pacman, but it appears that tutorials for more complex games might require higher-level understanding of the game than just single mechanics.

Submitted 17 September, 2018; v1 submitted 11 July, 2018; originally announced July 2018.

Comments: 10 pages, 11 figures, published at Foundations of Digital Games Conference 2018

Journal ref: Foundations of Digital Games (FDG) 2018

arXiv:1805.12475 [pdf, other]

Data-driven Design: A Case for Maximalist Game Design

Authors: Gabriella A. B. Barros, Michael Cerny Green, Antonios Liapis, Julian Togelius

Abstract: Maximalism in art refers to drawing on and combining multiple different sources for art creation, embracing the resulting collisions and heterogeneity. This paper discusses the use of maximalism in game design and particularly in data games, which are games that are generated partly based on open data. Using Data Adventures, a series of generators that create adventure games from data sources such as Wikipedia and OpenStreetMap, as a lens we explore several tradeoffs and issues in maximalist game design. This includes the tension between transformation and fidelity, between decorative and functional content, and legal and ethical issues resulting from this type of generativity. This paper sketches out the design space of maximalist data-driven games, a design space that is mostly unexplored.

Submitted 29 May, 2018; originally announced May 2018.

Comments: 9 pages, 2 Figures, Accepted in ICCC 2018

arXiv:1805.11768 [pdf, other]

"Press Space to Fire": Automatic Video Game Tutorial Generation

Authors: Michael Cerny Green, Ahmed Khalifa, Gabriella A. B. Barros, Julian Togelius

Abstract: We propose the problem of tutorial generation for games, i.e. to generate tutorials which can teach players to play games, as an AI problem. This problem can be approached in several ways, including generating natural language descriptions of game rules, generating instructive game levels, and generating demonstrations of how to play a game using agents that play in a human-like manner. We further argue that the General Video Game AI framework provides a useful testbed for addressing this problem.

Submitted 29 May, 2018; originally announced May 2018.

Comments: 6 pages, 4 figures, 1 table, Published at the EXAG workshop as a part of AIIDE 2017

arXiv:1803.09853 [pdf, other]

doi 10.1145/3235765.3235814

Generative Design in Minecraft (GDMC), Settlement Generation Competition

Authors: Christoph Salge, Michael Cerny Green, Rodrigo Canaan, Julian Togelius

Abstract: This paper introduces the settlement generation competition for Minecraft, the first part of the Generative Design in Minecraft challenge. The settlement generation competition is about creating Artificial Intelligence (AI) agents that can produce functional, aesthetically appealing and believable settlements adapted to a given Minecraft map - ideally at a level that can compete with human created designs. The aim of the competition is to advance procedural content generation for games, especially in overcoming the challenges of adaptive and holistic PCG. The paper introduces the technical details of the challenge, but mostly focuses on what challenges this competition provides and why they are scientifically relevant.

Submitted 30 July, 2018; v1 submitted 26 March, 2018; originally announced March 2018.

Comments: 10 pages, 5 figures, Part of the Foundations of Digital Games 2018 proceedings, as part of the workshop on Procedural Content Generation

Journal ref: In Foundations of Digital Games 2018 (FDG18), August 7-10, 2018, Malmö, Sweden. ACM, New York, NY, USA, 10 pages

arXiv:1802.06881 [pdf, other]

Automated Playtesting with Procedural Personas through MCTS with Evolved Heuristics

Authors: Christoffer Holmgård, Michael Cerny Green, Antonios Liapis, Julian Togelius

Abstract: This paper describes a method for generative player modeling and its application to the automatic testing of game content using archetypal player models called procedural personas. Theoretically grounded in psychological decision theory, procedural personas are implemented using a variation of Monte Carlo Tree Search (MCTS) where the node selection criteria are developed using evolutionary computation, replacing the standard UCB1 criterion of MCTS. Using these personas we demonstrate how generative player models can be applied to a varied corpus of game levels and demonstrate how different play styles can be enacted in each level. In short, we use artificially intelligent personas to construct synthetic playtesters. The proposed approach could be used as a tool for automatic play testing when human feedback is not readily available or when quick visualization of potential interactions is necessary. Possible applications include interactive tools during game development or procedural content generation systems where many evaluations must be conducted within a short time span.

Submitted 19 February, 2018; originally announced February 2018.

Comments: 10 pages, 6 figures

arXiv:1802.05219 [pdf, other]

Who Killed Albert Einstein? From Open Data to Murder Mystery Games

Authors: Gabriella A. B. Barros, Michael Cerny Green, Antonios Liapis, Julian Togelius

Abstract: This paper presents a framework for generating adventure games from open data. Focusing on the murder mystery type of adventure games, the generator is able to transform open data from Wikipedia articles, OpenStreetMap and images from Wikimedia Commons into WikiMysteries. Every WikiMystery game revolves around the murder of a person with a Wikipedia article and populates the game with suspects who must be arrested by the player if guilty of the murder or absolved if innocent. Starting from only one person as the victim, an extensive generative pipeline finds suspects, their alibis, and paths connecting them from open data, transforms open data into cities, buildings, non-player characters, locks and keys and dialog options. The paper describes in detail each generative step, provides a specific playthrough of one WikiMystery where Albert Einstein is murdered, and evaluates the outcomes of games generated for the 100 most influential people of the 20th century.

Submitted 14 February, 2018; originally announced February 2018.

Comments: 11 pages, 6 figures, 2 tables

Journal ref: 10.1109/TG.2018.2806190

Showing 1–50 of 53 results for author: Green, M