-
Effective Adaptive Mutation Rates for Program Synthesis
Authors:
Andrew Ni,
Lee Spector
Abstract:
The problem-solving performance of many evolutionary algorithms, including genetic programming systems used for program synthesis, depends on the values of hyperparameters including mutation rates. The mutation method used to produce some of the best results to date on software synthesis benchmark problems, Uniform Mutation by Addition and Deletion (UMAD), adds new genes into a genome at a predetermined rate and then deletes genes at a rate that balances the addition rate, producing no size change on average. While UMAD with a predetermined addition rate outperforms many other mutation and crossover schemes, we do not expect a single rate to be optimal across all problems or all generations within one run of an evolutionary system. However, many current adaptive mutation schemes such as self-adaptive mutation rates suffer from pathologies like the vanishing mutation rate problem, in which the mutation rate quickly decays to zero. We propose an adaptive bandit-based scheme that addresses this problem and essentially removes the need to specify a mutation rate. Although the proposed scheme itself introduces hyperparameters, we either set these to good values or ensemble them in a reasonable range. Results on software synthesis and symbolic regression problems validate the effectiveness of our approach.
Submitted 22 June, 2024;
originally announced June 2024.
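A minimal sketch of the fixed-rate UMAD operator described in the abstract above, not the authors' implementation and not the proposed bandit-based scheme. The names `umad`, `balanced_deletion_rate`, and `random_gene` are hypothetical. The sketch assumes a linear genome stored as a list, and it uses the deletion rate r / (1 + r) for an addition rate r, which is the value that leaves the expected genome size unchanged.

```python
import random


def balanced_deletion_rate(addition_rate):
    """Deletion rate that cancels the expected growth from additions."""
    return addition_rate / (1.0 + addition_rate)


def umad(genome, addition_rate, random_gene, rng=random):
    """Apply one addition pass, then one deletion pass, to a list genome."""
    grown = []
    for gene in genome:
        if rng.random() < addition_rate:
            new_gene = random_gene()
            # Insert the new gene before or after the existing one.
            if rng.random() < 0.5:
                grown.extend([new_gene, gene])
            else:
                grown.extend([gene, new_gene])
        else:
            grown.append(gene)
    # Each gene of the grown genome is dropped independently.
    deletion_rate = balanced_deletion_rate(addition_rate)
    return [gene for gene in grown if rng.random() >= deletion_rate]
```

With addition rate r, a genome of n genes grows to n(1 + r) genes in expectation, and deleting each with probability r / (1 + r) returns the expected size to n.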
-
Pareto-Optimal Learning from Preferences with Hidden Context
Authors:
Ryan Boldi,
Li Ding,
Lee Spector,
Scott Niekum
Abstract:
Ensuring AI models align with human values is essential for their safety and functionality. Reinforcement learning from human feedback (RLHF) uses human preferences to achieve this alignment. However, preferences sourced from diverse populations can result in point estimates of human values that may be sub-optimal or unfair to specific groups. We propose Pareto Optimal Preference Learning (POPL), which frames discrepant group preferences as objectives with potential trade-offs, aiming for policies that are Pareto-optimal on the preference dataset. POPL utilizes Lexicase selection, an iterative process to select diverse and Pareto-optimal solutions. Our empirical evaluations demonstrate that POPL surpasses baseline methods in learning sets of reward functions, effectively catering to distinct groups without access to group numbers or membership labels. Furthermore, we illustrate that POPL can serve as a foundation for techniques optimizing specific notions of group fairness, ensuring inclusive and equitable AI model alignment.
Submitted 21 June, 2024;
originally announced June 2024.
-
Leveraging Symbolic Regression for Heuristic Design in the Traveling Thief Problem
Authors:
Andrew Ni,
Lee Spector
Abstract:
The Traveling Thief Problem is an NP-hard combination of the well-known traveling salesman and knapsack packing problems. In this paper, we use symbolic regression to learn useful features of near-optimal packing plans, which we then use to design efficient metaheuristic genetic algorithms for the Traveling Thief Problem. By using symbolic regression again to initialize the metaheuristic GA with near-optimal individuals, we are able to design a fast, interpretable, and effective packing initialization scheme. Comparisons against previous initialization schemes validate our algorithm design.
Submitted 19 April, 2024;
originally announced April 2024.
-
DALex: Lexicase-like Selection via Diverse Aggregation
Authors:
Andrew Ni,
Li Ding,
Lee Spector
Abstract:
Lexicase selection has been shown to provide advantages over other selection algorithms in several areas of evolutionary computation and machine learning. In its standard form, lexicase selection filters a population or other collection based on randomly ordered training cases that are considered one at a time. This iterated filtering process can be time-consuming, particularly in settings with large numbers of training cases. In this paper, we propose a new method that is nearly equivalent to lexicase selection in terms of the individuals that it selects, but which does so significantly more quickly. The new method, called DALex (for Diversely Aggregated Lexicase), selects the best individual with respect to a weighted sum of training case errors, where the weights are randomly sampled. This allows us to formulate the core computation required for selection as matrix multiplication instead of recursive loops of comparisons, which in turn allows us to take advantage of optimized and parallel algorithms designed for matrix multiplication for speedup. Furthermore, we show that we can interpolate between the behavior of lexicase selection and its "relaxed" variants, such as epsilon or batch lexicase selection, by adjusting a single hyperparameter, named "particularity pressure," which represents the importance granted to each individual training case. Results on program synthesis, deep learning, symbolic regression, and learning classifier systems demonstrate that DALex achieves significant speedups over lexicase selection and its relaxed variants while maintaining almost identical problem-solving performance. Under a fixed computational budget, these savings free up resources that can be directed towards increasing population size or the number of generations, enabling the potential for solving more difficult problems.
Submitted 8 February, 2024; v1 submitted 22 January, 2024;
originally announced January 2024.
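To make the contrast in the abstract above concrete, here is a hedged sketch of standard lexicase selection next to a DALex-style weighted-sum selection. It is an illustration under stated assumptions, not the authors' code. The function names are hypothetical, `errors[i][j]` is assumed to hold the error of individual `i` on training case `j`, and the weights are assumed to come from a softmax over normally distributed scores whose standard deviation plays the role of the "particularity pressure". The abstract itself only states that the weights are randomly sampled. Pure Python loops are used here for self-containment; the speedup the abstract describes comes from expressing the weighted sum as a matrix multiplication.

```python
import math
import random


def lexicase_select(errors, rng=random):
    """Filter candidates on randomly ordered cases, one case at a time."""
    candidates = list(range(len(errors)))
    cases = list(range(len(errors[0])))
    rng.shuffle(cases)
    for case in cases:
        best = min(errors[i][case] for i in candidates)
        candidates = [i for i in candidates if errors[i][case] == best]
        if len(candidates) == 1:
            break
    return rng.choice(candidates)


def dalex_select(errors, particularity_pressure=20.0, rng=random):
    """Pick the individual with the lowest randomly weighted error sum."""
    n_cases = len(errors[0])
    # Assumption: importance scores are normal, weights are their softmax.
    scores = [rng.gauss(0.0, particularity_pressure) for _ in range(n_cases)]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    aggregated = [sum(w * e for w, e in zip(weights, row)) for row in errors]
    best = min(aggregated)
    return rng.choice([i for i, a in enumerate(aggregated) if a == best])
```

In this sketch a large pressure concentrates nearly all weight on a few cases, so the choice behaves much like lexicase filtering, while a small pressure approaches selection on the plain average error.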
-
Optimizing Neural Networks with Gradient Lexicase Selection
Authors:
Li Ding,
Lee Spector
Abstract:
One potential drawback of using aggregated performance measurement in machine learning is that models may learn to accept higher errors on some training cases as compromises for lower errors on others, with the lower errors actually being instances of overfitting. This can lead to both stagnation at local optima and poor generalization. Lexicase selection is an uncompromising method developed in evolutionary computation, which selects models on the basis of sequences of individual training case errors instead of using aggregated metrics such as loss and accuracy. In this paper, we investigate how lexicase selection, in its general form, can be integrated into the context of deep learning to enhance generalization. We propose Gradient Lexicase Selection, an optimization framework that combines gradient descent and lexicase selection in an evolutionary fashion. Our experimental results demonstrate that the proposed method improves the generalization performance of various widely-used deep neural network architectures across three image classification benchmarks. Additionally, qualitative analysis suggests that our method assists networks in learning more diverse representations. Our source code is available on GitHub: https://github.com/ld-ing/gradient-lexicase.
Submitted 19 December, 2023;
originally announced December 2023.
-
Objectives Are All You Need: Solving Deceptive Problems Without Explicit Diversity Maintenance
Authors:
Ryan Boldi,
Li Ding,
Lee Spector
Abstract:
Navigating deceptive domains has often been a challenge in machine learning due to search algorithms getting stuck at sub-optimal local optima. Many algorithms have been proposed to navigate these domains by explicitly maintaining diversity or equivalently promoting exploration, such as Novelty Search or other so-called Quality Diversity algorithms. In this paper, we present an approach with promise to solve deceptive domains without explicit diversity maintenance by optimizing a potentially large set of defined objectives. These objectives can be extracted directly from the environment by sub-aggregating the raw performance of individuals in a variety of ways. We use lexicase selection to optimize for these objectives as it has been shown to implicitly maintain population diversity. We compare this technique with a varying number of objectives to a commonly used quality diversity algorithm, MAP-Elites, on a set of discrete optimization as well as reinforcement learning domains with varying degrees of deception. We find that decomposing objectives into many objectives and optimizing them outperforms MAP-Elites on the deceptive domains that we explore. Furthermore, we find that this technique results in competitive performance on the diversity-focused metrics of QD-Score and Coverage, without explicitly optimizing for these metrics. Our ablation study shows that this technique is robust to different sub-aggregation techniques. However, when it comes to non-deceptive, or "illumination" domains, quality diversity techniques generally outperform our objective-based framework with respect to exploration (but not exploitation), hinting at potential directions for future work.
Submitted 3 November, 2023;
originally announced November 2023.
-
Quality Diversity through Human Feedback: Towards Open-Ended Diversity-Driven Optimization
Authors:
Li Ding,
Jenny Zhang,
Jeff Clune,
Lee Spector,
Joel Lehman
Abstract:
Reinforcement Learning from Human Feedback (RLHF) has shown potential in qualitative tasks where easily defined performance measures are lacking. However, there are drawbacks when RLHF is commonly used to optimize for average human preferences, especially in generative tasks that demand diverse model responses. Meanwhile, Quality Diversity (QD) algorithms excel at identifying diverse and high-quality solutions but often rely on manually crafted diversity metrics. This paper introduces Quality Diversity through Human Feedback (QDHF), a novel approach that progressively infers diversity metrics from human judgments of similarity among solutions, thereby enhancing the applicability and effectiveness of QD algorithms in complex and open-ended domains. Empirical studies show that QDHF significantly outperforms state-of-the-art methods in automatic diversity discovery and matches the efficacy of QD with manually crafted diversity metrics on standard benchmarks in robotics and reinforcement learning. Notably, in open-ended generative tasks, QDHF substantially enhances the diversity of text-to-image generation from a diffusion model and is more favorably received in user studies. We conclude by analyzing QDHF's scalability, robustness, and quality of derived diversity metrics, emphasizing its strength in open-ended optimization tasks. Code and tutorials are available at https://liding.info/qdhf.
Submitted 4 June, 2024; v1 submitted 18 October, 2023;
originally announced October 2023.
-
Constructor algorithms for building unconventional computers able to solve NP-complete problems
Authors:
Tony McCaffrey,
Thomas E. Gorochowski,
Lee Spector
Abstract:
Nature often builds physical structures tailored for specific information processing tasks with computations encoded using diverse phenomena. These can sometimes outperform typical general-purpose computers. However, describing the construction and function of these unconventional computers is often challenging. Here, we address this by introducing constructor algorithms in the context of a robotic wire machine that can be programmed to build networks of connected wires in response to a problem and then act upon these to efficiently carry out a desired computation. We show how this approach can be used to solve the NP-complete Subset Sum Problem (SSP) and provide information about the number of solutions through changes in the voltages and currents measured across these networks. This work provides a foundation for building unconventional computers that encode information purely in the lengths and connections of electrically conductive wires. It also demonstrates the power of computing paradigms beyond digital logic and opens avenues to more fully harness the inherent computational capabilities of diverse physical, chemical and biological substrates.
Submitted 18 September, 2023;
originally announced September 2023.
-
Particularity
Authors:
Lee Spector,
Li Ding,
Ryan Boldi
Abstract:
We describe a design principle for adaptive systems under which adaptation is driven by particular challenges that the environment poses, as opposed to average or otherwise aggregated measures of performance over many challenges. We trace the development of this "particularity" approach from the use of lexicase selection in genetic programming to "particularist" approaches to other forms of machine learning and to the design of adaptive systems more generally.
Submitted 11 June, 2023;
originally announced June 2023.
-
Probabilistic Lexicase Selection
Authors:
Li Ding,
Edward Pantridge,
Lee Spector
Abstract:
Lexicase selection is a widely used parent selection algorithm in genetic programming, known for its success in various task domains such as program synthesis, symbolic regression, and machine learning. Due to its non-parametric and recursive nature, calculating the probability of each individual being selected by lexicase selection has been proven to be an NP-hard problem, which discourages deeper theoretical understanding and practical improvements to the algorithm. In this work, we introduce probabilistic lexicase selection (plexicase selection), a novel parent selection algorithm that efficiently approximates the probability distribution of lexicase selection. Our method not only demonstrates superior problem-solving capabilities as a semantic-aware selection method, but also benefits from having a probabilistic representation of the selection process for enhanced efficiency and flexibility. Experiments are conducted in two prevalent domains in genetic programming: program synthesis and symbolic regression, using standard benchmarks including PSB and SRBench. The empirical results show that plexicase selection achieves state-of-the-art problem-solving performance that is competitive with lexicase selection, and significantly outperforms lexicase selection in computational efficiency.
Submitted 19 May, 2023;
originally announced May 2023.
-
Can the Problem-Solving Benefits of Quality Diversity Be Obtained Without Explicit Diversity Maintenance?
Authors:
Ryan Boldi,
Lee Spector
Abstract:
When using Quality Diversity (QD) optimization to solve hard exploration or deceptive search problems, we assume that diversity is extrinsically valuable. This means that diversity is important to help us reach an objective, but is not an objective in itself. Often, in these domains, practitioners benchmark their QD algorithms against single objective optimization frameworks. In this paper, we argue that the correct comparison should be made to multi-objective optimization frameworks. This is because single objective optimization frameworks rely on the aggregation of sub-objectives, which could result in decreased information that is crucial for maintaining diverse populations automatically. In order to facilitate a fair comparison between quality diversity and multi-objective optimization, we present a method that utilizes dimensionality reduction to automatically determine a set of behavioral descriptors for an individual, as well as a set of objectives for an individual to solve. Using the former, one can generate solutions using standard quality diversity optimization techniques, and using the latter, one can generate solutions using standard multi-objective optimization techniques. This allows for a level comparison between these two classes of algorithms, without requiring domain and algorithm specific modifications to facilitate a comparison.
Submitted 12 May, 2023;
originally announced May 2023.
-
Analyzing the Interaction Between Down-Sampling and Selection
Authors:
Ryan Boldi,
Ashley Bao,
Martin Briesch,
Thomas Helmuth,
Dominik Sobania,
Lee Spector,
Alexander Lalejini
Abstract:
Genetic programming systems often use large training sets to evaluate the quality of candidate solutions for selection. However, evaluating populations on large training sets can be computationally expensive. Down-sampling training sets has long been used to decrease the computational cost of evaluation in a wide range of application domains. Indeed, recent studies have shown that both random and informed down-sampling can substantially improve problem-solving success for GP systems that use the lexicase parent selection algorithm. We use the PushGP framework to experimentally test whether these down-sampling techniques can also improve problem-solving success in the context of two other commonly used selection methods, fitness-proportionate and tournament selection, across eight GP problems (four program synthesis and four symbolic regression). We verified that down-sampling can benefit the problem-solving success of both fitness-proportionate and tournament selection. However, the number of problems wherein down-sampling improved problem-solving success varied by selection scheme, suggesting that the impact of down-sampling depends both on the problem and choice of selection scheme. Surprisingly, we found that down-sampling was most consistently beneficial when combined with lexicase selection as compared to tournament and fitness-proportionate selection. Overall, our results suggest that down-sampling should be considered more often when solving test-based GP problems.
Submitted 14 April, 2023;
originally announced April 2023.
-
A Static Analysis of Informed Down-Samples
Authors:
Ryan Boldi,
Alexander Lalejini,
Thomas Helmuth,
Lee Spector
Abstract:
We present an analysis of the loss of population-level test coverage induced by different down-sampling strategies when combined with lexicase selection. We study recorded populations from the first generation of genetic programming runs, as well as entirely synthetic populations. Our findings verify the hypothesis that informed down-sampling better maintains population-level test coverage when compared to random down-sampling. Additionally, we show that both forms of down-sampling cause greater test coverage loss than standard lexicase selection with no down-sampling. However, given more information about the population, we found that informed down-sampling can further reduce its test coverage loss. We also recommend wider adoption of the static population analyses we present in this work.
Submitted 16 April, 2023; v1 submitted 4 April, 2023;
originally announced April 2023.
-
Informed Down-Sampled Lexicase Selection: Identifying productive training cases for efficient problem solving
Authors:
Ryan Boldi,
Martin Briesch,
Dominik Sobania,
Alexander Lalejini,
Thomas Helmuth,
Franz Rothlauf,
Charles Ofria,
Lee Spector
Abstract:
Genetic Programming (GP) often uses large training sets and requires all individuals to be evaluated on all training cases during selection. Random down-sampled lexicase selection evaluates individuals on only a random subset of the training cases allowing for more individuals to be explored with the same amount of program executions. However, creating a down-sample randomly might exclude important cases from the current down-sample for a number of generations, while cases that measure the same behavior (synonymous cases) may be overused despite their redundancy. In this work, we introduce Informed Down-Sampled Lexicase Selection. This method leverages population statistics to build down-samples that contain more distinct and therefore informative training cases. Through an empirical investigation across two different GP systems (PushGP and Grammar-Guided GP), we find that informed down-sampling significantly outperforms random down-sampling on a set of contemporary program synthesis benchmark problems. Through an analysis of the created down-samples, we find that important training cases are included in the down-sample consistently across independent evolutionary runs and systems. We hypothesize that this improvement can be attributed to the ability of Informed Down-Sampled Lexicase Selection to maintain more specialist individuals over the course of evolution, while also benefiting from reduced per-evaluation costs.
Submitted 22 February, 2024; v1 submitted 4 January, 2023;
originally announced January 2023.
-
Evolutionary Quantum Architecture Search for Parametrized Quantum Circuits
Authors:
Li Ding,
Lee Spector
Abstract:
Recent advancements in quantum computing have shown promising computational advantages in many problem areas. As one of those areas with increasing attention, hybrid quantum-classical machine learning systems have demonstrated the capability to solve various data-driven learning tasks. Recent works show that parameterized quantum circuits (PQCs) can be used to solve challenging reinforcement learning (RL) tasks with provable learning advantages. While existing works show the potential of PQC-based methods, the design choices of PQC architectures and their influences on the learning tasks are generally underexplored. In this work, we introduce EQAS-PQC, an evolutionary quantum architecture search framework for PQC-based models, which uses a population-based genetic algorithm to evolve PQC architectures by exploring the search space of quantum operations. Experimental results show that our method can significantly improve the performance of hybrid quantum-classical models in solving benchmark reinforcement learning problems. We also model the probability distributions of quantum operations in top-performing architectures to identify essential design choices that are critical to the performance.
Submitted 23 August, 2022;
originally announced August 2022.
-
Lexicase Selection at Scale
Authors:
Li Ding,
Ryan Boldi,
Thomas Helmuth,
Lee Spector
Abstract:
Lexicase selection is a semantic-aware parent selection method, which assesses individual test cases in a randomly-shuffled data stream. It has demonstrated success in multiple research areas including genetic programming, genetic algorithms, and more recently symbolic regression and deep learning. One potential drawback of lexicase selection and its variants is that the selection procedure requires evaluating training cases in a single data stream, making it difficult to handle tasks where the evaluation is computationally heavy or the dataset is large-scale, e.g., deep learning. In this work, we investigate how the weighted shuffle methods can be employed to improve the efficiency of lexicase selection. We propose a novel method, fast lexicase selection, which incorporates lexicase selection and weighted shuffle with partial evaluation. Experiments on both classic genetic programming and deep learning tasks indicate that the proposed method can significantly reduce the number of evaluation steps needed for lexicase selection to select an individual, improving its efficiency while maintaining the performance.
Submitted 22 August, 2022;
originally announced August 2022.
-
Functional Code Building Genetic Programming
Authors:
Edward Pantridge,
Thomas Helmuth,
Lee Spector
Abstract:
General program synthesis has become an important application area for genetic programming (GP), and for artificial intelligence more generally. Code Building Genetic Programming (CBGP) is a recently introduced GP method for general program synthesis that leverages reflection and first-class specifications to support the evolution of programs that may use arbitrary data types, polymorphism, and functions drawn from existing codebases. However, neither a formal description nor a thorough benchmarking of CBGP has yet been reported. In this work, we formalize the method of CBGP using algorithms from type theory. Specifically, we show that a functional programming language and a Hindley-Milner type system can be used to evolve type-safe programs using the process abstractly described in the original CBGP paper. Furthermore, we perform a comprehensive analysis of the search performance of this functional variant of CBGP compared to other contemporary GP program synthesis methods.
Submitted 9 June, 2022;
originally announced June 2022.
-
The Environmental Discontinuity Hypothesis for Down-Sampled Lexicase Selection
Authors:
Ryan Boldi,
Thomas Helmuth,
Lee Spector
Abstract:
Down-sampling training data has long been shown to improve the generalization performance of a wide range of machine learning systems. Recently, down-sampling has proved effective in genetic programming (GP) runs that utilize the lexicase parent selection technique. Although this down-sampling procedure has been shown to significantly improve performance across a variety of problems, it does not seem to do so due to encouraging adaptability through environmental change. We hypothesize that the random sampling that is performed every generation causes discontinuities that result in the population being unable to adapt to the shifting environment. We investigate modifications to down-sampled lexicase selection in hopes of promoting incremental environmental change to scaffold evolution by reducing the amount of jarring discontinuities between the environments of successive generations. In our empirical studies, we find that forcing incremental environmental change is not significantly better for evolving solutions to program synthesis problems than simple random down-sampling. In response to this, we attempt to exacerbate the hypothesized prevalence of discontinuities by using only disjoint down-samples to see if it hinders performance. We find that this also does not significantly differ from the performance of regular random down-sampling. These negative results raise new questions about the ways in which the composition of sub-samples, which may include synonymous cases, may be expected to influence the performance of machine learning systems that use down-sampling.
Submitted 31 May, 2022;
originally announced May 2022.
-
Evolving Neural Selection with Adaptive Regularization
Authors:
Li Ding,
Lee Spector
Abstract:
Over-parameterization is one of the inherent characteristics of modern deep neural networks, which can often be overcome by leveraging regularization methods, such as Dropout. Usually, these methods are applied globally and all the input cases are treated equally. However, given the natural variation of the input space for real-world tasks such as image recognition and natural language understanding, it is unlikely that a fixed regularization pattern will have the same effectiveness for all the input cases. In this work, we demonstrate a method in which the selection of neurons in deep neural networks evolves, adapting to the difficulty of prediction. We propose the Adaptive Neural Selection (ANS) framework, which evolves to weigh neurons in a layer to form network variants that are suitable to handle different input cases. Experimental results show that the proposed method can significantly improve the performance of commonly-used neural network architectures on standard image recognition benchmarks. Ablation studies also validate the effectiveness and contribution of each component in the proposed framework.
Submitted 4 April, 2022;
originally announced April 2022.
-
Problem-solving benefits of down-sampled lexicase selection
Authors:
Thomas Helmuth,
Lee Spector
Abstract:
In genetic programming, an evolutionary method for producing computer programs that solve specified computational problems, parent selection is ordinarily based on aggregate measures of performance across an entire training set. Lexicase selection, by contrast, selects on the basis of performance on random sequences of training cases; this has been shown to enhance problem-solving power in many circumstances. Lexicase selection can also be seen as better reflecting biological evolution, by modeling sequences of challenges that organisms face over their lifetimes. Recent work has demonstrated that the advantages of lexicase selection can be amplified by down-sampling, meaning that only a random subsample of the training cases is used each generation. This can be seen as modeling the fact that individual organisms encounter only subsets of the possible environments, and that environments change over time. Here we provide the most extensive benchmarking of down-sampled lexicase selection to date, showing that its benefits hold up to increased scrutiny. The reasons that down-sampling helps, however, are not yet fully understood. Hypotheses include that down-sampling allows for more generations to be processed with the same budget of program evaluations; that the variation of training data across generations acts as a changing environment, encouraging adaptation; or that it reduces overfitting, leading to more general solutions. We systematically evaluate these hypotheses, finding evidence against all three, and instead draw the conclusion that down-sampled lexicase selection's main benefit stems from the fact that it allows the evolutionary process to examine more individuals within the same computational budget, even though each individual is examined less completely.
Submitted 10 June, 2021;
originally announced June 2021.
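As an illustration of the down-sampling idea described in the abstract above, here is a minimal Python sketch. The names `down_sample` and `individuals_within_budget`, the `rate` parameter, and the budget figures are hypothetical, chosen only for illustration and not taken from the paper. The sketch draws a random subsample of the training cases for one generation and shows how a fixed budget of program evaluations covers more individuals when each individual is evaluated on fewer cases.

```python
import random

def down_sample(cases, rate, rng=random):
    """Return a random subsample of the training cases for one generation.

    `rate` is an assumed parameter: the fraction of cases to keep.
    """
    k = max(1, round(rate * len(cases)))
    return rng.sample(cases, k)

def individuals_within_budget(budget, cases_per_individual):
    """Number of individuals that fit in a fixed budget of program evaluations."""
    return budget // cases_per_individual

all_cases = list(range(100))                      # stand-in for 100 training cases
sub = down_sample(all_cases, 0.25, random.Random(0))

print(len(sub))                                             # 25
print(individuals_within_budget(300_000, len(all_cases)))   # 3000
print(individuals_within_budget(300_000, len(sub)))         # 12000
```

With these assumed numbers, a 25% down-sample allows four times as many individuals to be examined for the same budget, each on a quarter of the cases.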
-
Code Building Genetic Programming
Authors:
Edward Pantridge,
Lee Spector
Abstract:
In recent years the field of genetic programming has made significant advances towards automatic programming. Research and development of contemporary program synthesis methods, such as PushGP and Grammar Guided Genetic Programming, can produce programs that solve problems typically assigned in introductory academic settings. These problems focus on a narrow, predetermined set of simple data structures, basic control flow patterns, and primitive, non-overlapping data types (without, for example, inheritance or composite types). Few, if any, genetic programming methods for program synthesis have convincingly demonstrated the capability of synthesizing programs that use arbitrary data types, data structures, and specifications that are drawn from existing codebases. In this paper, we introduce Code Building Genetic Programming (CBGP) as a framework within which this can be done, by leveraging programming language features such as reflection and first-class specifications. CBGP produces a computational graph that can be executed or translated into source code of a host language. To demonstrate the novel capabilities of CBGP, we present results on new benchmarks that use non-primitive, polymorphic data types as well as some standard program synthesis benchmarks.
Submitted 9 August, 2020;
originally announced August 2020.
-
Lexicase selection in Learning Classifier Systems
Authors:
Sneha Aenugu,
Lee Spector
Abstract:
The lexicase parent selection method selects parents by considering performance on individual data points in random order instead of using a fitness function based on an aggregated data accuracy. While the method has demonstrated promise in genetic programming and more recently in genetic algorithms, its applications in other forms of evolutionary machine learning have not been explored. In this paper, we investigate the use of lexicase parent selection in Learning Classifier Systems (LCS) and study its effect on classification problems in a supervised setting. We further introduce a new variant of lexicase selection, called batch-lexicase selection, which allows for the tuning of selection pressure. We compare the two lexicase selection methods with tournament and fitness proportionate selection methods on binary classification problems. We show that batch-lexicase selection results in the creation of more generic rules which is favorable for generalization on future data. We further show that batch-lexicase selection results in better generalization in situations of partial or missing data.
Submitted 10 July, 2019;
originally announced July 2019.
-
Epsilon-Lexicase Selection for Regression
Authors:
William La Cava,
Lee Spector,
Kourosh Danai
Abstract:
Lexicase selection is a parent selection method that considers test cases separately, rather than in aggregate, when performing parent selection. It performs well in discrete error spaces but not on the continuous-valued problems that compose most system identification tasks. In this paper, we develop a new form of lexicase selection for symbolic regression, named epsilon-lexicase selection, that redefines the pass condition for individuals on each test case in a more effective way. We run a series of experiments on real-world and synthetic problems with several treatments of epsilon and quantify how epsilon affects parent selection and model performance. epsilon-lexicase selection is shown to be effective for regression, producing better fit models compared to other techniques such as tournament selection and age-fitness Pareto optimization. We demonstrate that epsilon can be adapted automatically for individual test cases based on the population performance distribution. Our experiments show that epsilon-lexicase selection with automatic epsilon produces the most accurate models across tested problems with negligible computational overhead. We show that behavioral diversity is exceptionally high in lexicase selection treatments, and that epsilon-lexicase selection makes use of more fitness cases when selecting parents than lexicase selection, which helps explain the performance improvement.
Submitted 30 May, 2019;
originally announced May 2019.
-
Lexicase Selection of Specialists
Authors:
Thomas Helmuth,
Edward Pantridge,
Lee Spector
Abstract:
Lexicase parent selection filters the population by considering one random training case at a time, eliminating any individuals with errors for the current case that are worse than the best error in the selection pool, until a single individual remains. This process often stops before considering all training cases, meaning that it will ignore the error values on any cases that were not yet considered. Lexicase selection can therefore select specialist individuals that have poor errors on some training cases, if they have great errors on others and those errors come near the start of the random list of cases used for the parent selection event in question. We hypothesize here that selecting these specialists, which may have poor total error, plays an important role in lexicase selection's observed performance advantages over error-aggregating parent selection methods such as tournament selection, which select specialists much less frequently. We conduct experiments examining this hypothesis, and find that lexicase selection's performance and diversity maintenance degrade when we deprive it of the ability of selecting specialists. These findings help explain the improved performance of lexicase selection compared to tournament selection, and suggest that specialists help drive evolution under lexicase selection toward global solutions.
Submitted 2 January, 2020; v1 submitted 22 May, 2019;
originally announced May 2019.
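The selection procedure described in the abstract above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `lexicase_select`, the `errors` matrix layout, and the random tie-break used when the cases run out are assumptions made for this example.

```python
import random

def lexicase_select(errors, rng=random):
    """Select one parent index by lexicase selection.

    errors[i][j] is the error of individual i on training case j (lower is better).
    """
    pool = list(range(len(errors)))
    cases = list(range(len(errors[0])))
    rng.shuffle(cases)                      # consider cases in random order
    for case in cases:
        if len(pool) == 1:                  # often stops before all cases are used
            break
        best = min(errors[i][case] for i in pool)
        # keep only individuals whose error on this case matches the best in the pool
        pool = [i for i in pool if errors[i][case] == best]
    # assumption: if several individuals survive every case, pick one at random
    return rng.choice(pool)

# Individual 0 is a specialist: best on case 0, poor elsewhere (total error 18).
# Individual 1 is a generalist with total error 3.
errors = [[0, 9, 9],
          [1, 1, 1]]
picks = [lexicase_select(errors, random.Random(seed)) for seed in range(50)]
print(sorted(set(picks)))
```

In this toy population the specialist is selected whenever case 0 happens to come first in the shuffled order, even though its total error is much worse than the generalist's, which is the behavior the abstract attributes to lexicase selection.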
-
A probabilistic and multi-objective analysis of lexicase selection and epsilon-lexicase selection
Authors:
William La Cava,
Thomas Helmuth,
Lee Spector,
Jason H. Moore
Abstract:
Lexicase selection is a parent selection method that considers training cases individually, rather than in aggregate, when performing parent selection. Whereas previous work has demonstrated the ability of lexicase selection to solve difficult problems in program synthesis and symbolic regression, the central goal of this paper is to develop the theoretical underpinnings that explain its performance. To this end, we derive an analytical formula that gives the expected probabilities of selection under lexicase selection, given a population and its behavior. In addition, we expand upon the relation of lexicase selection to many-objective optimization methods to describe the behavior of lexicase selection, which is to select individuals on the boundaries of Pareto fronts in high-dimensional space. We show analytically why lexicase selection performs more poorly for certain sizes of population and training cases, and show why it has been shown to perform more poorly in continuous error spaces. To address this last concern, we propose new variants of epsilon-lexicase selection, a method that modifies the pass condition in lexicase selection to allow near-elite individuals to pass cases, thereby improving selection performance with continuous errors. We show that epsilon-lexicase outperforms several diversity-maintenance strategies on a number of real-world and synthetic regression problems.
Submitted 29 April, 2018; v1 submitted 15 September, 2017;
originally announced September 2017.
-
Group Size Effect on the Success of Wolves Hunting
Authors:
Ramon Escobedo,
Denys Dutykh,
Cristina Muro,
Lee Spector,
Raymond Coppinger
Abstract:
Social foraging shows unexpected features such as the existence of a group size threshold to accomplish a successful hunt. Above this threshold, additional individuals do not increase the probability of capturing the prey. Recent direct observations of wolves in Yellowstone Park show that the group size threshold when hunting its most formidable prey, bison, is nearly three times greater than when hunting elk, a prey that is considerably less challenging to capture than bison. These observations provide empirical support to a computational particle model of group hunting which was previously shown to be effective in explaining why hunting success peaks at apparently small pack sizes when hunting elk. The model is based on considering two critical distances between wolves and prey: the minimal safe distance at which wolves stand from the prey, and the avoidance distance at which wolves move away from each other when they approach the prey. The minimal safe distance is longer when the prey is more dangerous to hunt. We show that the model explains effectively that the group size threshold is greater when the minimal safe distance is longer. Although both distances are longer when the prey is more dangerous, they contribute oppositely to the value of the group size threshold: the group size threshold is smaller when the avoidance distance is longer. This unexpected mechanism gives rise to a global increase of the group size threshold when considering bison with respect to elk, but other prey more dangerous than elk can lead to specific critical distances that can give rise to the same group size threshold. Our results show that the computational model can guide further research on group size effects, suggesting that more experimental observations should be obtained for other kinds of prey, e.g. moose.
Submitted 4 August, 2015;
originally announced August 2015.
-
Multiple orbital contributions to molecular high-harmonic generation in an asymmetric top
Authors:
Limor S. Spector,
Shungo Miyabe,
Alvaro Magana,
Simon Petretti,
Piero Decleva,
Todd Martinez,
Alejandro Saenz,
Markus Guehr,
Philip H. Bucksbaum
Abstract:
High-order harmonic generation (HHG) in aligned linear molecules can offer valuable information about strong-field interactions in lower-lying molecular orbitals, but extracting this information is difficult for three-dimensional molecular geometries. Our measurements of the asymmetric top SO2 show large axis dependencies, which change with harmonic order. The analysis shows that these spectral features must be due to field ionization and recombination from multiple orbitals during HHG. We expect that HHG can probe orbital dependencies using this approach for a broad class of asymmetric-top molecules.
Submitted 16 August, 2013;
originally announced August 2013.
-
Delayed Ultrafast X-ray Auger Probing (DUXAP) of Nucleobase Ultraviolet Photoprotection
Authors:
B. K. McFarland,
J. P. Farrell,
S. Miyabe,
F. Tarantelli,
A. Aguilar,
N. Berrah,
C. Bostedt,
J. Bozek,
P. H. Bucksbaum,
J. C. Castagna,
R. Coffee,
J. Cryan,
L. Fang,
R. Feifel,
K. Gaffney,
J. Glownia,
T. Martinez,
M. Mucke,
B. Murphy,
A. Natan,
T. Osipov,
V. Petrovic,
S. Schorb,
Th. Schultz,
L. Spector
, et al. (6 additional authors not shown)
Abstract:
We present a new method for ultrafast spectroscopy of molecular photoexcited dynamics. The technique uses a pair of femtosecond pulses: a photoexcitation pulse initiating excited state dynamics followed by a soft x-ray (SXR) probe pulse that core ionizes certain atoms inside the molecule. We observe the Auger decay of the core hole as a function of delay between the photoexcitation and SXR pulses. The core hole decay is particularly sensitive to the local valence electrons near the core and shows new types of propensity rules, compared to dipole selection rules in SXR absorption or emission spectroscopy. We apply the delayed ultrafast x-ray Auger probing (DUXAP) method to the specific problem of nucleobase photoprotection to demonstrate its potential. The ultraviolet photoexcited ππ* states of nucleobases are prone to chemical reactions with neighboring bases. To avoid this, the single molecules funnel the ππ* population to lower lying electronic states on an ultrafast timescale under violation of the Born-Oppenheimer approximation. The new type of propensity rule, which is confirmed by Auger decay simulations, allows us to have increased sensitivity on the direct relaxation from the ππ* state to the vibrationally hot electronic ground state. For the nucleobase thymine, we measure a decay constant of 300 fs in agreement with previous quantum chemical simulations.
Submitted 14 January, 2013;
originally announced January 2013.
-
Orientational decomposition of molecular high harmonic emission in three dimensions
Authors:
Limor S. Spector,
Maxim Artamonov,
Shungo Miyabe,
Todd Martinez,
Tamar Seideman,
Markus Guehr,
Philip H. Bucksbaum
Abstract:
An important goal in molecular physics and chemistry today is to obtain structure-dependent information about molecular function to obtain a deeper understanding into chemical reactions. However, until now, asymmetric tops, which comprise the widest and most general class of molecules, remain principally unexplored. This gap is particularly evident in high harmonic generation (HHG). HHG has successfully obtained structural information about electron hole pairs or orbitals for simple linear molecules. Unfortunately, for more complicated molecules, the emission from different molecular directions interfere, concealing individual angular signatures. Here we introduce a method to extract orientation-dependent information from asymmetric tops and apply it to the sulfur dioxide (SO2) molecule. We use the rotational revival structure to decompose the angular contributions of HHG emission. This method also extends HHG-based tomographic imaging into three dimensions and makes it applicable to a much wider class of systems than previously envisioned. Our results suggest that HHG is a powerful tool to probe electron orbital structure and dynamics of complex molecules.
Submitted 18 January, 2013; v1 submitted 10 July, 2012;
originally announced July 2012.
-
Strong field ionization to multiple electronic states in water
Authors:
Joe P. Farrell,
Simon Petretti,
Johann Förster,
Brian K. McFarland,
Limor S. Spector,
Yulian V. Vanne,
Piero Decleva,
Philip H. Bucksbaum,
Alejandro Saenz,
Markus Gühr
Abstract:
High harmonic spectra show that laser-induced strong field ionization of water has a significant contribution from an inner-valence orbital. Our experiment uses the ratio of H2O and D2O high harmonic yields to isolate the characteristic nuclear motion of the molecular ionic states. The nuclear motion initiated via ionization of the highest occupied molecular orbital (HOMO) is small and is expected to lead to similar harmonic yields for the two isotopes. In contrast, ionization of the second least bound orbital (HOMO-1) exhibits itself via a strong bending motion which creates a significant isotope effect. We elaborate on this interpretation by simulating strong field ionization and high harmonic generation from the water isotopes using the time-dependent Schrödinger equation. We expect that this isotope marking scheme for probing excited ionic states in strong field processes can be generalized to other molecules.
Submitted 22 March, 2011;
originally announced March 2011.
-
Influence of Phase Matching on the Cooper Minimum in Ar High Harmonic Spectra
Authors:
J. P. Farrell,
L. S. Spector,
B. K. McFarland,
P. H. Bucksbaum,
M. Gühr,
M. B. Gaarde,
K. J. Schafer
Abstract:
We study the influence of phase matching on interference minima in high harmonic spectra. We concentrate on structures in atoms due to interference of different angular momentum channels during recombination. We use the Cooper minimum (CM) in argon at 47 eV as a marker in the harmonic spectrum. We measure 2d harmonic spectra in argon as a function of wavelength and angular divergence. While we identify a clear CM in the spectrum when the target gas jet is placed after the laser focus, we find that the appearance of the CM varies with angular divergence and can even be completely washed out when the gas jet is placed closer to the focus. We also show that the argon CM appears at different wavelengths in harmonic and photo-absorption spectra measured under conditions independent of any wavelength calibration. We model the experiment with a simulation based on coupled solutions of the time-dependent Schrödinger equation and the Maxwell wave equation, including both the single atom response and macroscopic effects of propagation. The single atom calculations confirm that the ground state of argon can be represented by its field free $p$ symmetry, despite the strong laser field used in high harmonic generation. Because of this, the CM structure in the harmonic spectrum can be described as the interference of continuum $s$ and $d$ channels, whose relative phase jumps by $π$ at the CM energy, resulting in a minimum shifted from the photoionization result. We also show that the full calculations reproduce the dependence of the CM on the macroscopic conditions. We calculate simple phase matching factors as a function of harmonic order and explain our experimental and theoretical observation in terms of the effect of phase matching on the shape of the harmonic spectrum. Phase matching must be taken into account to fully understand spectral features related to HHG spectroscopy.
Submitted 4 November, 2010;
originally announced November 2010.
-
A quantum circuit for OR
Authors:
Howard Barnum,
Herbert J. Bernstein,
Lee Spector
Abstract:
We give the first quantum circuit for computing $f(0)$ OR $f(1)$ more reliably than is classically possible with a single evaluation of the function. OR therefore joins XOR (i.e. parity, $f(0) \oplus f(1)$) to give the full set of logical connectives (up to relabeling of inputs and outputs) for which there is quantum speedup. The XOR algorithm is of fundamental importance in quantum computation; our OR algorithm (found with the aid of genetic programming), may represent a new quantum computational effect, also useful as a ``subroutine''.
Submitted 8 October, 1999; v1 submitted 16 July, 1999;
originally announced July 1999.