Evolving Code with A Large Language Model

Authors: Erik Hemberg, Stephen Moskal, Una-May O'Reilly

Abstract: Algorithms that use Large Language Models (LLMs) to evolve code arrived on the Genetic Programming (GP) scene very recently. We present LLM GP, a formalized LLM-based evolutionary algorithm designed to evolve code. Like GP, it uses evolutionary operators, but its designs and implementations of those operators radically differ from GP's because they enlist an LLM, using prompting and the LLM's pre-trained pattern matching and sequence completion capability. We also present a demonstration-level variant of LLM GP and share its code. By addressing algorithms that range from the formal to hands-on, we cover design and LLM-usage considerations as well as the scientific challenges that arise when using an LLM for genetic programming.

Submitted 13 January, 2024; originally announced January 2024.

Comments: 34 pages, 9 figures, 6 Tables

ACM Class: I.2.8

arXiv:2310.06936 [pdf, other]

LLMs Killed the Script Kiddie: How Agents Supported by Large Language Models Change the Landscape of Network Threat Testing

Authors: Stephen Moskal, Sam Laney, Erik Hemberg, Una-May O'Reilly

Abstract: In this paper, we explore the potential of Large Language Models (LLMs) to reason about threats, generate information about tools, and automate cyber campaigns. We begin with a manual exploration of LLMs in supporting specific threat-related actions and decisions. We proceed by automating the decision process in a cyber campaign. We present prompt engineering approaches for a plan-act-report loop for one action of a threat campaign and a prompt chaining design that directs the sequential decision process of a multi-action campaign. We assess the extent of LLM's cyber-specific knowledge w.r.t. the short campaign we demonstrate and provide insights into prompt design for eliciting actionable responses. We discuss the potential impact of LLMs on the threat landscape and the ethical considerations of using LLMs for accelerating threat actor capabilities. We report a promising, yet concerning, application of generative AI to cyber threats. However, the LLM's capabilities to deal with more complex networks, sophisticated vulnerabilities, and the sensitivity of prompts are open questions. This research should spur deliberations over the inevitable advancements in LLM-supported cyber adversarial landscape.

Submitted 10 October, 2023; originally announced October 2023.

arXiv:2308.03191 [pdf, other]

The Facebook Algorithm's Active Role in Climate Advertisement Delivery

Authors: Aruna Sankaranarayanan, Erik Hemberg, Una-May O'Reilly

Abstract: Communication strongly influences attitudes on climate change. Within sponsored communication, high spend and high reach advertising dominates. In the advertising ecosystem we can distinguish actors with adversarial stances: organizations with contrarian or advocacy communication goals, who direct the advertisement delivery algorithm to launch ads in different destinations by specifying targets and campaign objectives. We present an observational (N=275,632) and a controlled (N=650) study which collectively indicate that the advertising delivery algorithm could itself be an actor, asserting statistically significant influence over advertisement destinations, characterized by U.S. state, gender type, or age range. This algorithmic behaviour may not entirely be understood by the advertising platform (and its creators). These findings have implications for climate communications and misinformation research, revealing that targeting intentions are not always fulfilled as requested and that delivery itself could be manipulated.

Submitted 7 August, 2023; v1 submitted 6 August, 2023; originally announced August 2023.

arXiv:2108.11025 [pdf, other]

Evaluating Efficacy of Indoor Non-Pharmaceutical Interventions against COVID-19 Outbreaks with a Coupled Spatial-SIR Agent-Based Simulation Framework

Authors: Chathika Gunaratne, Rene Reyes, Erik Hemberg, Una-May O'Reilly

Abstract: Contagious respiratory diseases, such as COVID-19, depend on sufficiently prolonged exposures for the successful transmission of the underlying pathogen. It is important for organizations to evaluate the efficacy of interventions aiming at mitigating viral transmission among their personnel. We have developed an operational risk assessment simulation framework that couples a spatial agent-based model of movement with a SIR epidemiological model to assess the relative risks of different intervention strategies. By applying our model on MIT's STATA building, we assess the impacts of three possible dimensions of intervention: one-way vs unrestricted movement, population size allowed onsite, and frequency of leaving designated work location for breaks. We find that there is no significant impact made by one-way movement restrictions over unrestricted movement. Instead, we find that a combination of lowering the number of individuals admitted below the current recommendations and advising individuals to reduce the frequency at which they leave their workstations lowers the likelihood of highly connected individuals within the contact networks that emerge, which in turn lowers the overall risk of infection. We discover three classes of possible interventions based on their epidemiological effects. By assuming a direct relationship between data on secondary attack rates and transmissibility in the SIR model, we compare relative infection risk of four respiratory diseases, MERS, SARS, COVID-19, and Measles, within the simulated area, and recommend appropriate intervention guidelines.

Submitted 24 August, 2021; originally announced August 2021.

arXiv:2108.02618 [pdf, other]

Using a Collated Cybersecurity Dataset for Machine Learning and Artificial Intelligence

Authors: Erik Hemberg, Una-May O'Reilly

Abstract: Artificial Intelligence (AI) and Machine Learning (ML) algorithms can support the span of indicator-level, e.g. anomaly detection, to behavioral level cyber security modeling and inference. This contribution is based on a dataset named BRON which is amalgamated from public threat and vulnerability behavioral sources. We demonstrate how BRON can support prediction of related threat techniques and attack patterns. We also discuss other AI and ML uses of BRON to exploit its behavioral knowledge.

Submitted 5 August, 2021; originally announced August 2021.

Comments: 5 pages, 2 figures, 2 tables, ACM KDD AI4Cyber: The 1st Workshop on Artificial Intelligence-enabled Cybersecurity Analytics at KDD'21

arXiv:2106.13590 [pdf, ps, other]

Fostering Diversity in Spatial Evolutionary Generative Adversarial Networks

Authors: Jamal Toutouh, Erik Hemberg, Una-May O'Reilly

Abstract: Generative adversary networks (GANs) suffer from training pathologies such as instability and mode collapse, which mainly arise from a lack of diversity in their adversarial interactions. Co-evolutionary GAN (CoE-GAN) training algorithms have been shown to be resilient to these pathologies. This article introduces Mustangs, a spatially distributed CoE-GAN, which fosters diversity by using different loss functions during the training. Experimental analysis on MNIST and CelebA demonstrated that Mustangs trains statistically more accurate generators.

Submitted 25 June, 2021; originally announced June 2021.

Comments: Accepted to be presented during Conference of the Spanish Association of Artificial Intelligence (CAEPIA 2021). arXiv admin note: substantial text overlap with arXiv:1905.12702

arXiv:2104.13254

Proceedings - AI/ML for Cybersecurity: Challenges, Solutions, and Novel Ideas at SIAM Data Mining 2021

Authors: John Emanuello, Kimberly Ferguson-Walter, Erik Hemberg, Una-May O'Reilly, Ahmad Ridley, Dennis Ross, Diane Staheli, William Streilein

Abstract: Malicious cyber activity is ubiquitous and its harmful effects have dramatic and often irreversible impacts on society. Given the shortage of cybersecurity professionals, the ever-evolving adversary, the massive amounts of data which could contain evidence of an attack, and the speed at which defensive actions must be taken, innovations which enable autonomy in cybersecurity must continue to expand, in order to move away from a reactive defense posture and towards a more proactive one. The challenges in this space are quite different from those associated with applying AI in other domains such as computer vision. The environment suffers from an incredibly high degree of uncertainty, stemming from the intractability of ingesting all the available data, as well as the possibility that malicious actors are manipulating the data. Another unique challenge in this space is that the dynamism of the adversary causes the indicators of compromise to change frequently and without warning. In spite of these challenges, machine learning has been applied to this domain and has achieved some success in the realm of detection. While this aspect of the problem is far from solved, a growing part of the commercial sector is providing ML-enhanced capabilities as a service. Many of these entities also provide platforms which facilitate the deployment of these automated solutions. Academic research in this space is growing and continues to influence current solutions, as well as strengthen foundational knowledge which will make autonomous agents in this space a possibility.

Submitted 1 June, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

arXiv:2104.11576 [pdf, other]

Automating Cyber Threat Hunting Using NLP, Automated Query Generation, and Genetic Perturbation

Authors: Prakruthi Karuna, Erik Hemberg, Una-May O'Reilly, Nick Rutar

Abstract: Scaling the cyber hunt problem poses several key technical challenges. Detecting and characterizing cyber threats at scale in large enterprise networks is hard because of the vast quantity and complexity of the data that must be analyzed as adversaries deploy varied and evolving tactics to accomplish their goals. There is a great need to automate all aspects, and, indeed, the workflow of cyber hunting. AI offers many ways to support this. We have developed the WILEE system that automates cyber threat hunting by translating high-level threat descriptions into many possible concrete implementations. Both the (high-level) abstract and (low-level) concrete implementations are represented using a custom domain specific language (DSL). WILEE uses the implementations along with other logic, also written in the DSL, to automatically generate queries to confirm (or refute) any hypotheses tied to the potential adversarial workflows represented at various layers of abstraction.

Submitted 23 April, 2021; originally announced April 2021.

Comments: 5 pages, 8 figures

arXiv:2010.00533 [pdf, other]

Linking Threat Tactics, Techniques, and Patterns with Defensive Weaknesses, Vulnerabilities and Affected Platform Configurations for Cyber Hunting

Authors: Erik Hemberg, Jonathan Kelly, Michal Shlapentokh-Rothman, Bryn Reinstadler, Katherine Xu, Nick Rutar, Una-May O'Reilly

Abstract: Many public sources of cyber threat and vulnerability information exist to help defend cyber systems. This paper links MITRE's ATT&CK MATRIX of Tactics and Techniques, NIST's Common Weakness Enumerations (CWE), Common Vulnerabilities and Exposures (CVE), and Common Attack Pattern Enumeration and Classification list (CAPEC), to gain further insight from alerts, threats and vulnerabilities. We preserve all entries and relations of the sources, while enabling bi-directional, relational path tracing within an aggregate data graph called BRON. In one example, we use BRON to enhance the information derived from a list of the top 10 most frequently exploited CVEs. We identify attack patterns, tactics, and techniques that exploit these CVEs and also uncover a disparity in how much linked information exists for each of these CVEs. This prompts us to further inventory BRON's collection of sources to provide a view of the extent and range of the coverage and blind spots of public data sources.

Submitted 10 February, 2021; v1 submitted 1 October, 2020; originally announced October 2020.

Comments: 13 pages, 12 figures

arXiv:2008.01124 [pdf, other]

Analyzing the Components of Distributed Coevolutionary GAN Training

Authors: Jamal Toutouh, Erik Hemberg, Una-May O'Reilly

Abstract: Distributed coevolutionary Generative Adversarial Network (GAN) training has empirically shown success in overcoming GAN training pathologies. This is mainly due to diversity maintenance in the populations of generators and discriminators during the training process. The method studied here coevolves sub-populations on each cell of a spatial grid organized into overlapping Moore neighborhoods. We investigate the impact on the performance of two algorithm components that influence the diversity during coevolution: the performance-based selection/replacement inside each sub-population and the communication through migration of solutions (networks) among overlapping neighborhoods. In experiments on MNIST dataset, we find that the combination of these two components provides the best generative models. In addition, migrating solutions without applying selection in the sub-populations achieves competitive results, while selection without communication between cells reduces performance.

Submitted 3 August, 2020; originally announced August 2020.

Comments: Accepted as a full paper in Sixteenth International Conference on Parallel Problem Solving from Nature (PPSN XVI)
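
The overlapping Moore neighborhoods mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `moore_neighborhood`, the radius of one cell, and the wrap-around behavior at the grid edges are all assumptions made for illustration.

```python
def moore_neighborhood(row, col, n_rows, n_cols):
    """Return the cell itself plus its 8 surrounding cells (radius-1 Moore
    neighborhood). Wrap-around at the grid edges is an assumption of this
    sketch, not something the abstract specifies."""
    cells = []
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            cells.append(((row + d_row) % n_rows, (col + d_col) % n_cols))
    return cells


# Neighborhoods of adjacent cells share members; this overlap is what lets
# solutions migrate across the grid over successive rounds.
a = set(moore_neighborhood(0, 0, 4, 4))
b = set(moore_neighborhood(0, 1, 4, 4))
shared = a & b
```

On a 4x4 grid, the neighborhoods centered on two horizontally adjacent cells have six cells in common, so a solution that does well in one cell is visible to its neighbor in the next round.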

arXiv:2004.04647 [pdf, other]

doi 10.1007/s10710-020-09389-y

Adversarial Genetic Programming for Cyber Security: A Rising Application Domain Where GP Matters

Authors: Una-May O'Reilly, Jamal Toutouh, Marcos Pertierra, Daniel Prado Sanchez, Dennis Garcia, Anthony Erb Luogo, Jonathan Kelly, Erik Hemberg

Abstract: Cyber security adversaries and engagements are ubiquitous and ceaseless. We delineate Adversarial Genetic Programming for Cyber Security, a research topic that, by means of genetic programming (GP), replicates and studies the behavior of cyber adversaries and the dynamics of their engagements. Adversarial Genetic Programming for Cyber Security encompasses extant and immediate research efforts in a vital problem domain, arguably occupying a position at the frontier where GP matters. Additionally, it prompts research questions around evolving complex behavior by expressing different abstractions with GP and opportunities to reconnect to the Machine Learning, Artificial Life, Agent-Based Modeling and Cyber Security communities. We present a framework called RIVALS which supports the study of network security arms races. Its goal is to elucidate the dynamics of cyber networks under attack by computationally modeling and simulating them.

Submitted 6 April, 2020; originally announced April 2020.

arXiv:2004.04642 [pdf, other]

Data Dieting in GAN Training

Authors: Jamal Toutouh, Una-May O'Reilly, Erik Hemberg

Abstract: We investigate training Generative Adversarial Networks, GANs, with less data. Subsets of the training dataset can express empirical sample diversity while reducing training resource requirements, e.g. time and memory. We ask how much data reduction impacts generator performance and gauge the additive value of generator ensembles. In addition to considering stand-alone GAN training and ensembles of generator models, we also consider reduced data training on an evolutionary GAN training framework named Redux-Lipizzaner. Redux-Lipizzaner makes GAN training more robust and accurate by exploiting overlapping neighborhood-based training on a spatial 2D grid. We conduct empirical experiments on Redux-Lipizzaner using the MNIST and CelebA data sets.

Submitted 6 April, 2020; originally announced April 2020.

Comments: Chapter 14 of the Book "Deep Neural Evolution - Deep Learning with Evolutionary Computation"

arXiv:2004.04633 [pdf, other]

Parallel/distributed implementation of cellular training for generative adversarial neural networks

Authors: Emiliano Perez, Sergio Nesmachnow, Jamal Toutouh, Erik Hemberg, Una-May O'Reilly

Abstract: Generative adversarial networks (GANs) are widely used to learn generative models. GANs consist of two networks, a generator and a discriminator, that apply adversarial learning to optimize their parameters. This article presents a parallel/distributed implementation of a cellular competitive coevolutionary method to train two populations of GANs. A distributed memory parallel implementation is proposed for execution in high performance/supercomputing centers. Efficient results are reported on addressing the generation of handwritten digits (MNIST dataset samples). Moreover, the proposed implementation is able to reduce the training times and scale properly when considering different grid sizes for training.

Submitted 3 August, 2020; v1 submitted 7 April, 2020; originally announced April 2020.

Comments: This article has been accepted for publication in IEEE International Parallel and Distributed Processing Symposium, Parallel and Distributed Combinatorics and Optimization, 2020

arXiv:2003.13532 [pdf, other]

doi 10.1145/3377930.3390229

Re-purposing Heterogeneous Generative Ensembles with Evolutionary Computation

Authors: Jamal Toutouh, Erik Hemberg, Una-May O'Reilly

Abstract: Generative Adversarial Networks (GANs) are popular tools for generative modeling. The dynamics of their adversarial learning give rise to convergence pathologies during training such as mode and discriminator collapse. In machine learning, ensembles of predictors demonstrate better results than a single predictor for many tasks. In this study, we apply two evolutionary algorithms (EAs) to create ensembles to re-purpose generative models, i.e., given a set of heterogeneous generators that were optimized for one objective (e.g., minimize Frechet Inception Distance), create ensembles of them for optimizing a different objective (e.g., maximize the diversity of the generated samples). The first method is restricted by the exact size of the ensemble and the second method only restricts the upper bound of the ensemble size. Experimental analysis on the MNIST image benchmark demonstrates that both EA ensemble creation methods can re-purpose the models, without reducing their original functionality. The EA-based methods demonstrate significantly better performance compared to other heuristic-based methods. When comparing the two evolutionary methods, the one with only an upper bound on the ensemble size is the best.

Submitted 3 August, 2020; v1 submitted 30 March, 2020; originally announced March 2020.

Comments: Accepted as a full paper for the Genetic and Evolutionary Computation Conference - GECCO'20

arXiv:1905.12702 [pdf, other]

doi 10.1145/3321707.3321860

Spatial Evolutionary Generative Adversarial Networks

Authors: Jamal Toutouh, Erik Hemberg, Una-May O'Reilly

Abstract: Generative adversary networks (GANs) suffer from training pathologies such as instability and mode collapse. These pathologies mainly arise from a lack of diversity in their adversarial interactions. Evolutionary generative adversarial networks apply the principles of evolutionary computation to mitigate these problems. We hybridize two of these approaches that promote training diversity. One, E-GAN, at each batch, injects mutation diversity by training the (replicated) generator with three independent objective functions then selecting the resulting best performing generator for the next batch. The other, Lipizzaner, injects population diversity by training a two-dimensional grid of GANs with a distributed evolutionary algorithm that includes neighbor exchanges of additional training adversaries, performance based selection and population-based hyper-parameter tuning. We propose to combine mutation and population approaches to diversity improvement. We contribute a superior evolutionary GANs training method, Mustangs, that eliminates the single loss function used across Lipizzaner's grid. Instead, each training round, a loss function is selected with equal probability, from among the three E-GAN uses. Experimental analyses on standard benchmarks, MNIST and CelebA, demonstrate that Mustangs provides a statistically faster training method resulting in more accurate networks.

Submitted 29 May, 2019; originally announced May 2019.

arXiv:1812.05767 [pdf, other]

Using Detailed Access Trajectories for Learning Behavior Analysis

Authors: Yanbang Wang, Nancy Law, Erik Hemberg, Una-May O'Reilly

Abstract: Student learning activity in MOOCs can be viewed from multiple perspectives. We present a new organization of MOOC learner activity data at a resolution that is in between the fine granularity of the clickstream and coarse organizations that count activities, aggregate students or use long duration time units. A detailed access trajectory (DAT) consists of binary values and is two dimensional with one axis that is a time series, e.g. days, and the other that is a chronologically ordered list of a MOOC component type's instances, e.g. videos in instructional order. Most popular MOOC platforms generate data that can be organized as detailed access trajectories (DATs). We explore the value of DATs by conducting four empirical mini-studies. Our studies suggest DATs contain rich information about students' learning behaviors and facilitate MOOC learning analyses.

Submitted 13 December, 2018; originally announced December 2018.

Comments: 10 pages, accepted by 2019 International Conference on Learning Analytics and Knowledge
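
The detailed access trajectory described in the abstract above is a binary, two-dimensional structure, which a minimal sketch can make concrete. This is an illustration only: the function name `build_dat` and the `(day_index, item_index)` event format are hypothetical and do not come from the paper.

```python
def build_dat(access_events, n_days, n_items):
    """Build a binary detailed access trajectory (DAT) for one learner.

    access_events: iterable of (day_index, item_index) pairs. This input
    format is a hypothetical choice made for the sketch.
    Returns n_items rows (items in instructional order), each holding
    n_days binary entries (1 = the item was accessed on that day).
    """
    dat = [[0] * n_days for _ in range(n_items)]
    for day, item in access_events:
        # Ignore events that fall outside the course window or item list.
        if 0 <= day < n_days and 0 <= item < n_items:
            dat[item][day] = 1  # repeated accesses on the same day stay 1
    return dat


# One learner: two videos on day 0, video 1 twice on day 2, video 2 on day 3.
events = [(0, 0), (0, 1), (2, 1), (2, 1), (3, 2)]
dat = build_dat(events, n_days=4, n_items=3)
```

Because entries are binary, the structure records whether an item was touched on a given day but not how often, which places it between raw clickstream data and coarse activity counts.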

arXiv:1812.05043 [pdf, other]

Transfer Learning using Representation Learning in Massive Open Online Courses

Authors: Mucong Ding, Yanbang Wang, Erik Hemberg, Una-May O'Reilly

Abstract: In a Massive Open Online Course (MOOC), predictive models of student behavior can support multiple aspects of learning, including instructor feedback and timely intervention. Ongoing courses, when the student outcomes are yet unknown, must rely on models trained from the historical data of previously offered courses. It is possible to transfer models, but they often have poor prediction performance. One reason is features that inadequately represent predictive attributes common to both courses. We present an automated transductive transfer learning approach that addresses this issue. It relies on problem-agnostic, temporal organization of the MOOC clickstream data, where, for each student, for multiple courses, a set of specific MOOC event types is expressed for each time unit. It consists of two alternative transfer methods based on representation learning with auto-encoders: a passive approach using transductive principal component analysis and an active approach that uses a correlation alignment loss term. With these methods, we investigate the transferability of dropout prediction across similar and dissimilar MOOCs and compare with known methods. Results show improved model transferability and suggest that the methods are capable of automatically learning a feature representation that expresses common predictive characteristics of MOOCs.

Submitted 18 December, 2018; v1 submitted 12 December, 2018; originally announced December 2018.

Comments: 10 pages, 11 figures, accepted at LAK'19

arXiv:1811.12843 [pdf, other]

Lipizzaner: A System That Scales Robust Generative Adversarial Network Training

Authors: Tom Schmiedlechner, Ignavier Ng Zhi Yong, Abdullah Al-Dujaili, Erik Hemberg, Una-May O'Reilly

Abstract: GANs are difficult to train due to convergence pathologies such as mode and discriminator collapse. We introduce Lipizzaner, an open source software system that allows machine learning engineers to train GANs in a distributed and robust way. Lipizzaner distributes a competitive coevolutionary algorithm which, by virtue of dual, adapting, generator and discriminator populations, is robust to collapses. The algorithm is well suited to efficient distribution because it uses a spatial grid abstraction. Training is local to each cell and strong intermediate training results are exchanged among overlapping neighborhoods allowing high performing solutions to propagate and improve with more rounds of training. Experiments on common image datasets overcome critical collapses. Communication overhead scales linearly when increasing the number of compute instances and we observe that increasing scale leads to improved model performance.

Submitted 30 November, 2018; originally announced November 2018.

Comments: Systems for ML Workshop (MLSYS) at NeurIPS 2018

arXiv:1807.08194 [pdf, other]

Towards Distributed Coevolutionary GANs

Authors: Abdullah Al-Dujaili, Tom Schmiedlechner, Erik Hemberg, Una-May O'Reilly

Abstract: Generative Adversarial Networks (GANs) have become one of the dominant methods for deep generative modeling. Despite their demonstrated success on multiple vision tasks, GANs are difficult to train and much research has been dedicated towards understanding and improving their gradient-based learning dynamics. Here, we investigate the use of coevolution, a class of black-box (gradient-free) co-optimization techniques and a powerful tool in evolutionary computing, as a supplement to gradient-based GAN training techniques. Experiments on a simple model that exhibits several of the GAN gradient-based dynamics (e.g., mode collapse, oscillatory behavior, and vanishing gradients) show that coevolution is a promising framework for escaping degenerate GAN training behaviors.

Submitted 31 August, 2018; v1 submitted 21 July, 2018; originally announced July 2018.

Comments: Accepted at AAAI 2018 Fall Symposium Series

arXiv:1805.03553 [pdf, other]

On Visual Hallmarks of Robustness to Adversarial Malware

Authors: Alex Huang, Abdullah Al-Dujaili, Erik Hemberg, Una-May O'Reilly

Abstract: A central challenge of adversarial learning is to interpret the resulting hardened model. In this contribution, we ask how robust generalization can be visually discerned and whether a concise view of the interactions between a hardened decision map and input samples is possible. We first provide a means of visually comparing a hardened model's loss behavior with respect to the adversarial variants generated during training versus loss behavior with respect to adversarial variants generated from other sources. This allows us to confirm that the association of observed flatness of a loss landscape with generalization that is seen with naturally trained models extends to adversarially hardened models and robust generalization. To complement these means of interpreting model parameter robustness we also use self-organizing maps to provide a visual means of superimposing adversarial and natural variants on a model's decision space, thus allowing the model's global robustness to be comprehensively examined.

Submitted 9 May, 2018; originally announced May 2018.

Comments: Submitted to the IReDLiA workshop at the Federated Artificial Intelligence Meeting (FAIM) 2018

arXiv:1804.10586 [pdf, other]

Approximating Nash Equilibria for Black-Box Games: A Bayesian Optimization Approach

Authors: Abdullah Al-Dujaili, Erik Hemberg, Una-May O'Reilly

Abstract: Game theory has emerged as a powerful framework for modeling a large range of multi-agent scenarios. Many algorithmic solutions require discrete, finite games with payoffs that have a closed-form specification. In contrast, many real-world applications require modeling with continuous action spaces and black-box utility functions where payoff information is available only in the form of empirical (often expensive and/or noisy) observations of strategy profiles. To the best of our knowledge, few tools exist for solving the class of expensive, black-box continuous games. In this paper, we develop a method to find equilibria for such games in a sequential decision-making framework using Bayesian Optimization. The proposed approach is validated on a collection of synthetic game problems with varying degree of noise and action space dimensions. The results indicate that it is capable of improving the maximum regret in noisy and high dimensions to a greater extent than hierarchical or discretized methods.

Submitted 11 June, 2018; v1 submitted 27 April, 2018; originally announced April 2018.

Comments: Accepted at OptMAS@AAMAS'18

arXiv:1801.02950 [pdf, other]

Adversarial Deep Learning for Robust Detection of Binary Encoded Malware

Authors: Abdullah Al-Dujaili, Alex Huang, Erik Hemberg, Una-May O'Reilly

Abstract: Malware is constantly adapting in order to avoid detection. Model based malware detectors, such as SVM and neural networks, are vulnerable to so-called adversarial examples which are modest changes to detectable malware that allow the resulting malware to evade detection. Continuous-valued methods that are robust to adversarial examples of images have been developed using saddle-point optimization formulations. We are inspired by them to develop similar methods for the discrete, e.g. binary, domain which characterizes the features of malware. A specific extra challenge of malware is that the adversarial examples must be generated in a way that preserves their malicious functionality. We introduce methods capable of generating functionally preserved adversarial malware examples in the binary domain. Using the saddle-point formulation, we incorporate the adversarial examples into the training of models that are robust to them. We evaluate the effectiveness of the methods and others in the literature on a set of Portable Executable (PE) files. Comparison prompts our introduction of an online measure computed during training to assess general expectation of robustness.

Submitted 25 March, 2018; v1 submitted 9 January, 2018; originally announced January 2018.

Comments: 1st Deep Learning and Security Workshop (co-located with the 39th IEEE Symposium on Security and Privacy)

arXiv:1712.00206 [pdf, other]

Distributed Stratified Locality Sensitive Hashing for Critical Event Prediction in the Cloud

Authors: Alessandro De Palma, Erik Hemberg, Una-May O'Reilly

Abstract: The availability of massive healthcare data repositories calls for efficient tools for data-driven medicine. We introduce a distributed system for Stratified Locality Sensitive Hashing to perform fast similarity-based prediction on large medical waveform datasets. Our implementation, for an ICU use case, prioritizes latency over throughput and is targeted at a cloud environment. We demonstrate our system on Acute Hypotensive Episode prediction from Arterial Blood Pressure waveforms. On a dataset of $1.37$ million points, we show scaling up to $40$ processors and a $21\times$ speedup in number of comparisons to parallel exhaustive search at the price of a $10\%$ Matthews correlation coefficient (MCC) loss. Furthermore, if additional MCC loss can be tolerated, our system achieves speedups up to two orders of magnitude.

Submitted 1 December, 2017; originally announced December 2017.

Comments: Accepted poster at NIPS 2017 Workshop on Machine Learning for Health (https://ml4health.github.io/2017/)

arXiv:1703.08535 [pdf, other]

doi 10.1145/3067695.3082469

PonyGE2: Grammatical Evolution in Python

Authors: Michael Fenton, James McDermott, David Fagan, Stefan Forstenlechner, Michael O'Neill, Erik Hemberg

Abstract: Grammatical Evolution (GE) is a population-based evolutionary algorithm, where a formal grammar is used in the genotype to phenotype mapping process. PonyGE2 is an open source implementation of GE in Python, developed at UCD's Natural Computing Research and Applications group. It is intended as an advertisement and a starting-point for those new to GE, a reference for students and researchers, a rapid-prototyping medium for our own experiments, and a Python workout. As well as providing the characteristic genotype to phenotype mapping of GE, a search algorithm engine is also provided. A number of sample problems and tutorials on how to use and adapt PonyGE2 have been developed.

Submitted 26 April, 2017; v1 submitted 24 March, 2017; originally announced March 2017.

Comments: 8 pages, 4 figures, submitted to the 2017 GECCO Workshop on Evolutionary Computation Software Systems (EvoSoft)

Journal ref: In Proceedings of GECCO '17 Companion, Berlin, Germany, July 15-19, 2017, 8 pages

Showing 1–24 of 24 results for author: Hemberg, E