Search | arXiv e-print repository

Making Task-Oriented Dialogue Datasets More Natural by Synthetically Generating Indirect User Requests

Authors: Amogh Mannekote, **seok Nam, Ziming Li, Jian Gao, Kristy Elizabeth Boyer, Bonnie J. Dorr

Abstract: Indirect User Requests (IURs), such as "It's cold in here" instead of "Could you please increase the temperature?" are common in human-human task-oriented dialogue and require world knowledge and pragmatic reasoning from the listener. While large language models (LLMs) can handle these requests effectively, smaller models deployed on virtual assistants often struggle due to resource constraints. Moreover, existing task-oriented dialogue benchmarks lack sufficient examples of complex discourse phenomena such as indirectness. To address this, we propose a set of linguistic criteria along with an LLM-based pipeline for generating realistic IURs to test natural language understanding (NLU) and dialogue state tracking (DST) models before deployment in a new domain. We also release IndirectRequests, a dataset of IURs based on the Schema Guided Dialog (SGD) corpus, as a comparative testbed for evaluating the performance of smaller models in handling indirect requests.

Submitted 16 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

arXiv:2405.01014

Proven Runtime Guarantees for How the MOEA/D Computes the Pareto Front From the Subproblem Solutions

Authors: Benjamin Doerr, Martin S. Krejca, Noé Weeks

Abstract: The decomposition-based multi-objective evolutionary algorithm (MOEA/D) does not directly optimize a given multi-objective function $f$, but instead optimizes $N + 1$ single-objective subproblems of $f$ in a co-evolutionary manner. It maintains an archive of all non-dominated solutions found and outputs it as an approximation to the Pareto front. Once the MOEA/D has found all optima of the subproblems (the $g$-optima), it may still miss Pareto optima of $f$. The algorithm is then tasked to find the remaining Pareto optima directly by mutating the $g$-optima. In this work, we analyze for the first time how the MOEA/D with only standard mutation operators computes the whole Pareto front of the OneMinMax benchmark when the $g$-optima are a strict subset of the Pareto front. For standard bit mutation, we prove an expected runtime of $O(n N \log n + n^{n/(2N)} N \log n)$ function evaluations. Especially for the second, more interesting phase when the algorithm starts with all $g$-optima, we prove an $Ω(n^{(1/2)(n/N + 1)} \sqrt{N} 2^{-n/N})$ expected runtime. This runtime is super-polynomial if $N = o(n)$, since this leaves large gaps between the $g$-optima, which require costly mutations to cover. For power-law mutation with exponent $β\in (1, 2)$, we prove an expected runtime of $O\left(n N \log n + n^β \log n\right)$ function evaluations. The $O\left(n^β \log n\right)$ term stems from the second phase of starting with all $g$-optima, and it is independent of the number of subproblems $N$. This leads to a huge speedup compared to the lower bound for standard bit mutation. In general, our overall bound for power-law mutation suggests that the MOEA/D performs best for $N = O(n^{β- 1})$, resulting in an $O(n^β\log n)$ bound. In contrast to standard bit mutation, smaller values of $N$ are better for power-law mutation, as it is capable of easily creating missing solutions.

Submitted 3 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

arXiv:2404.12746

Near-Tight Runtime Guarantees for Many-Objective Evolutionary Algorithms

Authors: Simon Wietheger, Benjamin Doerr

Abstract: Despite significant progress in the field of mathematical runtime analysis of multi-objective evolutionary algorithms (MOEAs), the performance of MOEAs on discrete many-objective problems is little understood. In particular, the few existing bounds for the SEMO, global SEMO, and SMS-EMOA algorithms on classic benchmarks are all roughly quadratic in the size of the Pareto front. In this work, we prove near-tight runtime guarantees for these three algorithms on the four most common benchmark problems OneMinMax, CountingOnesCountingZeros, LeadingOnesTrailingZeros, and OneJumpZeroJump, and this for arbitrary numbers of objectives. Our bounds depend only linearly on the Pareto front size, showing that these MOEAs on these benchmarks cope much better with many objectives than what previous works suggested. Our bounds are tight apart from small polynomial factors in the number of objectives and length of bitstrings. This is the first time that such tight bounds are proven for many-objective uses of these MOEAs. While it is known that such results cannot hold for the NSGA-II, we do show that our bounds, via a recent structural result, transfer to the NSGA-III algorithm.

Submitted 11 June, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

arXiv:2404.09371

The Effect of Data Partitioning Strategy on Model Generalizability: A Case Study of Morphological Segmentation

Authors: Zoey Liu, Bonnie J. Dorr

Abstract: Recent work to enhance data partitioning strategies for more realistic model evaluation faces challenges in providing a clear optimal choice. This study addresses these challenges, focusing on morphological segmentation and synthesizing limitations related to language diversity, adoption of multiple datasets and splits, and detailed model comparisons. Our study leverages data from 19 languages, including ten indigenous or endangered languages across 10 language families with diverse morphological systems (polysynthetic, fusional, and agglutinative) and different degrees of data availability. We conduct large-scale experimentation with varying sized combinations of training and evaluation sets as well as new test data. Our results show that, when faced with new test data: (1) models trained from random splits are able to achieve higher numerical scores; (2) model rankings derived from random splits tend to generalize more consistently.

Submitted 14 April, 2024; originally announced April 2024.

Comments: Accepted to 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (16 pages including 9 tables and 1 figure)

ACM Class: I.2.7

arXiv:2404.04018

doi 10.1145/3638529.3654140

Superior Genetic Algorithms for the Target Set Selection Problem Based on Power-Law Parameter Choices and Simple Greedy Heuristics

Authors: Benjamin Doerr, Martin S. Krejca, Nguyen Vu

Abstract: The target set selection problem (TSS) asks for a set of vertices such that an influence spreading process started in these vertices reaches the whole graph. The current state of the art for this NP-hard problem are three recently proposed randomized search heuristics, namely a biased random-key genetic algorithm (BRKGA) obtained from extensive parameter tuning, a max-min ant system (MMAS), and an MMAS using Q-learning with a graph convolutional network. We show that the BRKGA with two simple modifications and without the costly parameter tuning obtains significantly better results. Our first modification is to simply choose all parameters of the BRKGA in each iteration randomly from a power-law distribution. The resulting parameterless BRKGA is already competitive with the tuned BRKGA, as our experiments on the previously used benchmarks show. We then add a natural greedy heuristic, namely to repeatedly discard small-degree vertices that are not necessary for reaching the whole graph. The resulting algorithm consistently outperforms all of the state-of-the-art algorithms. Besides providing a superior algorithm for the TSS problem, this work shows that randomized parameter choices and elementary greedy heuristics can give better results than complex algorithms and costly parameter tuning.

Submitted 5 April, 2024; originally announced April 2024.

arXiv:2404.03838

A Block-Coordinate Descent EMO Algorithm: Theoretical and Empirical Analysis

Authors: Benjamin Doerr, Joshua Knowles, Aneta Neumann, Frank Neumann

Abstract: We consider whether conditions exist under which block-coordinate descent is asymptotically efficient in evolutionary multi-objective optimization, addressing an open problem. Block-coordinate descent, where an optimization problem is decomposed into $k$ blocks of decision variables and each of the blocks is optimized (with the others fixed) in a sequence, is a technique used in some large-scale optimization problems such as airline scheduling; however, its use in multi-objective optimization is less studied. We propose a block-coordinate version of GSEMO and compare its running time to the standard GSEMO algorithm. Theoretical and empirical results on a bi-objective test function, a variant of LOTZ, serve to demonstrate the existence of cases where block-coordinate descent is faster. The result may yield wider insights into this class of algorithms.

Submitted 10 April, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

Comments: Accepted at GECCO 2024

arXiv:2404.02090

Already Moderate Population Sizes Provably Yield Strong Robustness to Noise

Authors: Denis Antipov, Benjamin Doerr, Alexandra Ivanova

Abstract: Experience shows that typical evolutionary algorithms can cope well with stochastic disturbances such as noisy function evaluations. In this first mathematical runtime analysis of the $(1+λ)$ and $(1,λ)$ evolutionary algorithms in the presence of prior bit-wise noise, we show that both algorithms can tolerate constant noise probabilities without increasing the asymptotic runtime on the OneMax benchmark. For this, a population size $λ$ suffices that is at least logarithmic in the problem size $n$. The only previous result in this direction regarded the less realistic one-bit noise model, required a population size super-linear in the problem size, and proved a runtime guarantee roughly cubic in the noiseless runtime for the OneMax benchmark. Our significantly stronger results are based on the novel proof argument that the noiseless offspring can be seen as a biased uniform crossover between the parent and the noisy offspring. We are optimistic that the technical lemmas resulting from this insight will find applications also in future mathematical runtime analyses of evolutionary algorithms.

Submitted 13 May, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: Full version of the same-titled paper accepted at GECCO 2024

arXiv:2403.08153

The Runtime of Random Local Search on the Generalized Needle Problem

Authors: Benjamin Doerr, Andrew James Kelley

Abstract: In their recent work, C. Doerr and Krejca (Transactions on Evolutionary Computation, 2023) proved upper bounds on the expected runtime of the randomized local search heuristic on generalized Needle functions. Based on these upper bounds, they deduce in a not fully rigorous manner a drastic influence of the needle radius $k$ on the runtime. In this short article, we add the missing lower bound necessary to determine the influence of parameter $k$ on the runtime. To this aim, we derive an exact description of the expected runtime, which also significantly improves the upper bound given by C. Doerr and Krejca. We also describe asymptotic estimates of the expected runtime.

Submitted 19 March, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

Comments: 18 pages

arXiv:2402.14155

Can Similarity-Based Domain-Ordering Reduce Catastrophic Forgetting for Intent Recognition?

Authors: Amogh Mannekote, Xiaoyi Tian, Kristy Elizabeth Boyer, Bonnie J. Dorr

Abstract: Task-oriented dialogue systems are expected to handle a constantly expanding set of intents and domains even after they have been deployed to support more and more functionalities. To live up to this expectation, it becomes critical to mitigate the catastrophic forgetting problem (CF) that occurs in continual learning (CL) settings for a task such as intent recognition. While existing dialogue systems research has explored replay-based and regularization-based methods to this end, the effect of domain ordering on the CL performance of intent recognition models remains unexplored. If understood well, domain ordering has the potential to be an orthogonal technique that can be leveraged alongside existing techniques such as experience replay. Our work fills this gap by comparing the impact of three domain-ordering strategies (min-sum path, max-sum path, random) on the CL performance of a generative intent recognition model. Our findings reveal that the min-sum path strategy outperforms the others in reducing catastrophic forgetting when training on the 220M T5-Base model. However, this advantage diminishes with the larger 770M T5-Large model. These results underscore the potential of domain ordering as a complementary strategy for mitigating catastrophic forgetting in continually learning intent recognition models, particularly in resource-constrained scenarios.

Submitted 21 February, 2024; originally announced February 2024.

arXiv:2312.10290

doi 10.1609/AAAI.V38I18.30077

Runtime Analysis of the SMS-EMOA for Many-Objective Optimization

Authors: Weijie Zheng, Benjamin Doerr

Abstract: The classic NSGA-II was recently proven to have considerable difficulties in many-objective optimization. This paper conducts the first rigorous runtime analysis in many objectives for the SMS-EMOA, a steady-state NSGA-II that uses the hypervolume contribution instead of the crowding distance as the second selection criterion. To this aim, we first propose a many-objective counterpart, the m-objective mOJZJ, of the bi-objective OJZJ, which is the first many-objective multimodal benchmark for runtime analysis. We prove that SMS-EMOA computes the full Pareto front of this benchmark in an expected number of $O(μM n^k)$ iterations, where $n$ denotes the problem size (length of the bit-string representation), $k$ the gap size (a difficulty parameter of the problem), $M=(2n/m-2k+3)^{m/2}$ the size of the Pareto front, and $μ$ the population size (at least the same size as the largest incomparable set). This result together with the existing negative result for the original NSGA-II shows that, in principle, the general approach of the NSGA-II is suitable for many-objective optimization, but the crowding distance as tie-breaker has deficiencies. We obtain three additional insights on the SMS-EMOA. Different from a recent result for the bi-objective OJZJ benchmark, a recently proposed stochastic population update often does not help for mOJZJ. It at most results in a speed-up by a factor of order $2^{k} / μ$, which is $Θ(1)$ for large $m$, such as $m>k$. On the positive side, we prove that heavy-tailed mutation irrespective of the number $m$ of objectives results in a speed-up of order $k^{0.5+k-β}/e^k$, the same advantage as previously shown for the bi-objective case. Finally, we conduct the first runtime analyses of the SMS-EMOA on the classic bi-objective OneMinMax and LOTZ benchmarks and show that the SMS-EMOA has a performance comparable to the GSEMO and the NSGA-II.

Submitted 12 June, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

Comments: Full version of a paper accepted in AAAI 2024

Journal ref: Conference on Artificial Intelligence, AAAI 2024, pages 20874-20882

arXiv:2311.11215

SPLAIN: Augmenting Cybersecurity Warnings with Reasons and Data

Authors: Vera A. Kazakova, Jena D. Hwang, Bonnie J. Dorr, Yorick Wilks, J. Blake Gage, Alex Memory, Mark A. Clark

Abstract: Effective cyber threat recognition and prevention demand comprehensible forecasting systems, as prior approaches commonly offer limited and, ultimately, unconvincing information. We introduce Simplified Plaintext Language (SPLAIN), a natural language generator that converts warning data into user-friendly cyber threat explanations. SPLAIN is designed to generate clear, actionable outputs, incorporating hierarchically organized explanatory details about input data and system functionality. Given the inputs of individual sensor-induced forecasting signals and an overall warning from a fusion module, SPLAIN queries each signal for information on contributing sensors and data signals. This collected data is processed into a coherent English explanation, encompassing forecasting, sensing, and data elements for user review. SPLAIN's template-based approach ensures consistent warning structure and vocabulary. SPLAIN's hierarchical output structure allows each threat and its components to be expanded to reveal underlying explanations on demand. Our conclusions emphasize the need for designers to specify the "how" and "why" behind cyber warnings, advocate for simple structured templates in generating consistent explanations, and recognize that direct causal links in Machine Learning approaches may not always be identifiable, requiring some explanations to focus on general methodologies, such as model and training data.

Submitted 18 November, 2023; originally announced November 2023.

Comments: Presented at FLAIRS-2019 as poster (see ancillary files)

ACM Class: I.2

Journal ref: FLAIRS-2019

arXiv:2310.04042

doi 10.1016/j.tcs.2023.114074

Bivariate Estimation-of-Distribution Algorithms Can Find an Exponential Number of Optima

Authors: Benjamin Doerr, Martin S. Krejca

Abstract: Finding a large set of optima in a multimodal optimization landscape is a challenging task. Classical population-based evolutionary algorithms typically converge only to a single solution. While this can be counteracted by applying niching strategies, the number of optima is nonetheless trivially bounded by the population size. Estimation-of-distribution algorithms (EDAs) are an alternative, maintaining a probabilistic model of the solution space instead of a population. Such a model is able to implicitly represent a solution set far larger than any realistic population size. To support the study of how optimization algorithms handle large sets of optima, we propose the test function EqualBlocksOneMax (EBOM). It has an easy fitness landscape with exponentially many optima. We show that the bivariate EDA mutual-information-maximizing input clustering, without any problem-specific modification, quickly generates a model that behaves very similarly to a theoretically ideal model for EBOM, which samples each of the exponentially many optima with the same maximal probability. We also prove via mathematical means that no univariate model can come close to having this property: If the probability to sample an optimum is at least inverse-polynomial, there is a Hamming ball of logarithmic radius such that, with high probability, each sample is in this ball.

Submitted 6 October, 2023; originally announced October 2023.

Journal ref: Theoretical Computer Science 971: 114074 (2023)

arXiv:2307.06524

Agreement Tracking for Multi-Issue Negotiation Dialogues

Authors: Amogh Mannekote, Bonnie J. Dorr, Kristy Elizabeth Boyer

Abstract: Automated negotiation support systems aim to help human negotiators reach more favorable outcomes in multi-issue negotiations (e.g., an employer and a candidate negotiating over issues such as salary, hours, and promotions before a job offer). To be successful, these systems must accurately track agreements reached by participants in real-time. Existing approaches either focus on task-oriented dialogues or produce unstructured outputs, rendering them unsuitable for this objective. Our work introduces the novel task of agreement tracking for two-party multi-issue negotiations, which requires continuous monitoring of agreements within a structured state space. To address the scarcity of annotated corpora with realistic multi-issue negotiation dialogues, we use GPT-3 to build GPT-Negochat, a synthesized dataset that we make publicly available. We present a strong initial baseline for our task by transfer-learning a T5 model trained on the MultiWOZ 2.4 corpus. Pre-training T5-small and T5-base on MultiWOZ 2.4's DST task enhances results by 21% and 9% respectively over training solely on GPT-Negochat. We validate our method's sample-efficiency via smaller training subset experiments. By releasing GPT-Negochat and our baseline models, we aim to encourage further research in multi-issue negotiation dialogue agreement tracking.

Submitted 12 July, 2023; originally announced July 2023.

arXiv:2305.18736

LonXplain: Lonesomeness as a Consequence of Mental Disturbance in Reddit Posts

Authors: Muskan Garg, Chandni Saxena, Debabrata Samanta, Bonnie J. Dorr

Abstract: Social media is a potential source of information that infers latent mental states through Natural Language Processing (NLP). While narrating real-life experiences, social media users convey their feeling of loneliness or isolated lifestyle, impacting their mental well-being. Existing literature on psychological theories points to loneliness as the major consequence of interpersonal risk factors, propounding the need to investigate loneliness as a major aspect of mental disturbance. We formulate lonesomeness detection in social media posts as an explainable binary classification problem, discovering the users at-risk, suggesting the need of resilience for early control. To the best of our knowledge, there is no existing explainable dataset, i.e., one with human-readable, annotated text spans, to facilitate further research and development in loneliness detection causing mental disturbance. In this work, three experts (a senior clinical psychologist, a rehabilitation counselor, and a social NLP researcher) define annotation schemes and perplexity guidelines to mark the presence or absence of lonesomeness, along with the marking of text-spans in original posts as explanation, in 3,521 Reddit posts. We expect the public release of our dataset, LonXplain, and traditional classifiers as baselines via GitHub.

Submitted 30 May, 2023; originally announced May 2023.

arXiv:2305.18585

Exploiting Explainability to Design Adversarial Attacks and Evaluate Attack Resilience in Hate-Speech Detection Models

Authors: Pranath Reddy Kumbam, Sohaib Uddin Syed, Prashanth Thamminedi, Suhas Harish, Ian Perera, Bonnie J. Dorr

Abstract: The advent of social media has given rise to numerous ethical challenges, with hate speech among the most significant concerns. Researchers are attempting to tackle this problem by leveraging hate-speech detection and employing language models to automatically moderate content and promote civil discourse. Unfortunately, recent studies have revealed that hate-speech detection systems can be misled by adversarial attacks, raising concerns about their resilience. While previous research has separately addressed the robustness of these models under adversarial attacks and their interpretability, there has been no comprehensive study exploring their intersection. The novelty of our work lies in combining these two critical aspects, leveraging interpretability to identify potential vulnerabilities and enabling the design of targeted adversarial attacks. We present a comprehensive and comparative analysis of adversarial robustness exhibited by various hate-speech detection models. Our study evaluates the resilience of these models against adversarial attacks using explainability techniques. To gain insights into the models' decision-making processes, we employ the Local Interpretable Model-agnostic Explanations (LIME) framework. Based on the explainability results obtained by LIME, we devise and execute targeted attacks on the text by leveraging the TextAttack tool. Our findings enhance the understanding of the vulnerabilities and strengths exhibited by state-of-the-art hate-speech detection models. This work underscores the importance of incorporating explainability in the development and evaluation of such models to enhance their resilience against adversarial attacks. Ultimately, this work paves the way for creating more robust and reliable hate-speech detection systems, fostering safer online environments and promoting ethical discourse on social media platforms.

Submitted 29 May, 2023; originally announced May 2023.

arXiv:2305.13459

doi 10.24963/ijcai.2023/613

The First Proven Performance Guarantees for the Non-Dominated Sorting Genetic Algorithm II (NSGA-II) on a Combinatorial Optimization Problem

Authors: Sacha Cerf, Benjamin Doerr, Benjamin Hebras, Yakob Kahane, Simon Wietheger

Abstract: The Non-dominated Sorting Genetic Algorithm-II (NSGA-II) is one of the most prominent algorithms to solve multi-objective optimization problems. Recently, the first mathematical runtime guarantees have been obtained for this algorithm, however only for synthetic benchmark problems. In this work, we give the first proven performance guarantees for a classic optimization problem, the NP-complete bi-objective minimum spanning tree problem. More specifically, we show that the NSGA-II with population size $N \ge 4((n-1) w_{\max} + 1)$ computes all extremal points of the Pareto front in an expected number of $O(m^2 n w_{\max} \log(n w_{\max}))$ iterations, where $n$ is the number of vertices, $m$ the number of edges, and $w_{\max}$ is the maximum edge weight in the problem instance. This result confirms, via mathematical means, the good performance of the NSGA-II observed empirically. It also shows that mathematical analyses of this algorithm are not only possible for synthetic benchmark problems, but also for more complex combinatorial optimization problems. As a side result, we also obtain a new analysis of the performance of the global SEMO algorithm on the bi-objective minimum spanning tree problem, which improves the previous best result by a factor of $|F|$, the number of extremal points of the Pareto front, a set that can be as large as $n w_{\max}$. The main reason for this improvement is our observation that both multi-objective evolutionary algorithms find the different extremal points in parallel rather than sequentially, as assumed in the previous proofs.

Submitted 9 June, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

Comments: Author-generated version of a paper appearing in the proceedings of IJCAI 2023, with appendix

Journal ref: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, Main Track. Pages 5522-5530. 2023

arXiv:2305.10259 [pdf, ps, other]

doi 10.24963/ijcai.2023/616

Runtime Analyses of Multi-Objective Evolutionary Algorithms in the Presence of Noise

Authors: Matthieu Dinot, Benjamin Doerr, Ulysse Hennebelle, Sebastian Will

Abstract: In single-objective optimization, it is well known that evolutionary algorithms also without further adjustments can tolerate a certain amount of noise in the evaluation of the objective function. In contrast, this question is not at all understood for multi-objective optimization. In this work, we conduct the first mathematical runtime analysis of a simple multi-objective evolutionary algorithm (MOEA) on a classic benchmark in the presence of noise in the objective functions. We prove that when bit-wise prior noise with rate $p \le α/n$, $α$ a suitable constant, is present, the simple evolutionary multi-objective optimizer (SEMO) without any adjustments to cope with noise finds the Pareto front of the OneMinMax benchmark in time $O(n^2\log n)$, just as in the case without noise. Given that the problem here is to arrive at a population consisting of $n+1$ individuals witnessing the Pareto front, this is a surprisingly strong robustness to noise (comparably simple evolutionary algorithms cannot optimize the single-objective OneMax problem in polynomial time when $p = ω(\log(n)/n)$). Our proofs suggest that the strong robustness of the MOEA stems from its implicit diversity mechanism designed to enable it to compute a population covering the whole Pareto front. Interestingly this result only holds when the objective value of a solution is determined only once and the algorithm from that point on works with this, possibly noisy, objective value. We prove that when all solutions are reevaluated in each iteration, then any noise rate $p = ω(\log(n)/n^2)$ leads to a super-polynomial runtime. This is very different from single-objective optimization, where it is generally preferred to reevaluate solutions whenever their fitness is important and where examples are known such that not reevaluating solutions can lead to catastrophic performance losses.

Submitted 30 May, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

Comments: Appears at IJCAI 2023

Journal ref: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence. Main Track. Pages 5549-5557. 2023
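
For readers unfamiliar with the setting of the abstract above, the following is a minimal noise-free sketch of SEMO on OneMinMax. It is an illustration under assumptions, not the authors' code: the names `one_min_max`, `weakly_dominates`, and `semo` are hypothetical, one-bit mutation and uniform parent selection are assumed, and the noise model analyzed in the paper is not implemented. Each objective value is computed once and stored with the solution.

```python
import random

def one_min_max(x):
    """OneMinMax: (number of zeros, number of ones), both to be maximized."""
    ones = sum(x)
    return (len(x) - ones, ones)

def weakly_dominates(u, v):
    """True if objective vector u is at least as good as v in every objective."""
    return all(a >= b for a, b in zip(u, v))

def semo(n, rng, max_iters=1_000_000):
    """Run SEMO until the population covers all n+1 objective vectors.

    Returns the population (objective vector -> solution) and the number
    of iterations used. max_iters is a safety cap (an assumption here).
    """
    start = tuple(rng.randint(0, 1) for _ in range(n))
    population = {one_min_max(start): start}
    for iteration in range(1, max_iters + 1):
        if len(population) == n + 1:
            return population, iteration - 1
        # Pick a parent uniformly at random and flip one random bit.
        parent = rng.choice(list(population.values()))
        i = rng.randrange(n)
        child = parent[:i] + (1 - parent[i],) + parent[i + 1:]
        f_child = one_min_max(child)
        # Keep the child only if no stored solution weakly dominates it;
        # then discard stored solutions the child weakly dominates.
        if not any(weakly_dominates(f, f_child) for f in population):
            population = {f: x for f, x in population.items()
                          if not weakly_dominates(f_child, f)}
            population[f_child] = child
    return population, max_iters

if __name__ == "__main__":
    pop, iters = semo(8, random.Random(0))
    print(len(pop), "objective vectors found after", iters, "iterations")
```

Because every solution of OneMinMax is Pareto optimal, the population can only grow here, which is the implicit diversity mechanism the abstract refers to.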

arXiv:2305.04553 [pdf, ps, other]

Larger Offspring Populations Help the $(1 + (λ, λ))$ Genetic Algorithm to Overcome the Noise

Authors: Alexandra Ivanova, Denis Antipov, Benjamin Doerr

Abstract: Evolutionary algorithms are known to be robust to noise in the evaluation of the fitness. In particular, larger offspring population sizes often lead to strong robustness. We analyze to what extent the $(1+(λ,λ))$ genetic algorithm is robust to noise. This algorithm also works with larger offspring population sizes, but an intermediate selection step and a non-standard use of crossover as repair mechanism could render this algorithm less robust than, e.g., the simple $(1+λ)$ evolutionary algorithm. Our experimental analysis on several classic benchmark problems shows that this difficulty does not arise. Surprisingly, in many situations this algorithm is even more robust to noise than the $(1+λ)$ EA.

Submitted 8 May, 2023; originally announced May 2023.

Comments: Author-generated version of the same paper published at GECCO 2023

arXiv:2304.10848 [pdf, other]

doi 10.1145/3583131.3590390

How Well Does the Metropolis Algorithm Cope With Local Optima?

Authors: Benjamin Doerr, Taha El Ghazi El Houssaini, Amirhossein Rajabi, Carsten Witt

Abstract: The Metropolis algorithm (MA) is a classic stochastic local search heuristic. It avoids getting stuck in local optima by occasionally accepting inferior solutions. To better and in a rigorous manner understand this ability, we conduct a mathematical runtime analysis of the MA on the CLIFF benchmark. Apart from one local optimum, cliff functions are monotonically increasing towards the global optimum. Consequently, to optimize a cliff function, the MA only once needs to accept an inferior solution. Despite seemingly being an ideal benchmark for the MA to profit from its main working principle, our mathematical runtime analysis shows that this hope does not come true. Even with the optimal temperature (the only parameter of the MA), the MA optimizes most cliff functions less efficiently than simple elitist evolutionary algorithms (EAs), which can only leave the local optimum by generating a superior solution possibly far away. This result suggests that our understanding of why the MA is often very successful in practice is not yet complete. Our work also suggests to equip the MA with global mutation operators, an idea supported by our preliminary experiments.

Submitted 15 May, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

Comments: To appear in the proceedings of GECCO 2023. With appendix containing all proofs. 28 pages
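
As a rough illustration of the objects named in the abstract above, the sketch below shows one common form of the cliff function and a single Metropolis step with one-bit flips. Both are assumptions for illustration: the exact CLIFF definition and the temperature parameterization used in the paper may differ, and the names `cliff` and `metropolis_step` are hypothetical.

```python
import math
import random

def cliff(x, d):
    """Assumed cliff function: OneMax up to n - d ones, then a drop of d - 1/2."""
    ones = sum(x)
    n = len(x)
    return ones if ones <= n - d else ones - d + 0.5

def metropolis_step(x, d, temperature, rng):
    """One Metropolis step (maximization) using a random one-bit flip.

    Improvements and ties are always accepted; a worsening by |delta| is
    accepted with probability exp(delta / temperature), where delta < 0.
    """
    i = rng.randrange(len(x))
    y = x[:i] + [1 - x[i]] + x[i + 1:]
    delta = cliff(y, d) - cliff(x, d)
    if delta >= 0 or rng.random() < math.exp(delta / temperature):
        return y
    return x

if __name__ == "__main__":
    rng = random.Random(0)
    x = [0] * 10
    for _ in range(1000):
        x = metropolis_step(x, d=3, temperature=0.5, rng=rng)
    print(sum(x), cliff(x, 3))
```

With this definition, a point with exactly n - d ones is the local optimum, and the search has to accept one fitness-decreasing step to continue towards the all-ones string.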

arXiv:2304.10414 [pdf, ps, other]

doi 10.1145/3583131.3590509

How the Move Acceptance Hyper-Heuristic Copes With Local Optima: Drastic Differences Between Jumps and Cliffs

Authors: Benjamin Doerr, Arthur Dremaux, Johannes Lutzeyer, Aurélien Stumpf

Abstract: In recent work, Lissovoi, Oliveto, and Warwicker (Artificial Intelligence (2023)) proved that the Move Acceptance Hyper-Heuristic (MAHH) leaves the local optimum of the multimodal cliff benchmark with remarkable efficiency. With its $O(n^3)$ runtime, for almost all cliff widths $d$, the MAHH massively outperforms the $Θ(n^d)$ runtime of simple elitist evolutionary algorithms (EAs). For the most prominent multimodal benchmark, the jump functions, the given runtime estimates of $O(n^{2m} m^{-Θ(m)})$ and $Ω(2^{Ω(m)})$, for gap size $m \ge 2$, are far apart and the real performance of MAHH is still an open question. In this work, we resolve this question. We prove that for any choice of the MAHH selection parameter $p$, the expected runtime of the MAHH on a jump function with gap size $m = o(n^{1/2})$ is at least $Ω(n^{2m-1} / (2m-1)!)$. This renders the MAHH much slower than simple elitist evolutionary algorithms with their typical $O(n^m)$ runtime. We also show that the MAHH with the global bit-wise mutation operator instead of the local one-bit operator optimizes jump functions in time $O(\min\{m n^m,\frac{n^{2m-1}}{m!Ω(m)^{m-2}}\})$, essentially the minimum of the optimization times of the $(1+1)$ EA and the MAHH. This suggests that combining several ways to cope with local optima can be a fruitful approach.

Submitted 20 April, 2023; originally announced April 2023.

Comments: Author generated version of a paper appearing in the proceedings of GECCO. With an appendix containing the material omitted in the conference version for reasons of space
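
The jump benchmark mentioned in the abstract above can be sketched as follows. This uses the definition commonly found in the runtime-analysis literature, which is assumed here rather than taken from the paper; the names `jump` and `gap_crossing_probability` are hypothetical. The second function computes the chance that standard bit mutation with rate $1/n$ flips exactly the $m$ missing bits from the local optimum, which is the event behind the typical $O(n^m)$ runtime of elitist EAs.

```python
def jump(x, m):
    """Assumed jump function with gap size m (to be maximized)."""
    n = len(x)
    ones = sum(x)
    if ones <= n - m or ones == n:
        return m + ones
    # Inside the gap the fitness points away from the optimum.
    return n - ones

def gap_crossing_probability(n, m):
    """Probability that bit-wise mutation with rate 1/n flips exactly the
    m zero-bits of a local optimum and leaves the other n - m bits alone."""
    return (1.0 / n) ** m * (1.0 - 1.0 / n) ** (n - m)

if __name__ == "__main__":
    n, m = 10, 3
    for ones in (7, 8, 9, 10):
        print(ones, jump([1] * ones + [0] * (n - ones), m))
    print(gap_crossing_probability(n, m))
```

The reciprocal of this probability grows like $n^m$ up to a constant factor, which is why a one-bit operator that must walk through the gap behaves so differently from a global operator that can jump over it.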

arXiv:2303.11150 [pdf, other]

Randomized Rumor Spreading Revisited (Long Version)

Authors: Benjamin Doerr, Anatolii Kostrygin

Abstract: We develop a simple and generic method to analyze randomized rumor spreading processes in fully connected networks. In contrast to all previous works, which heavily exploit the precise definition of the process under investigation, we only need to understand the probability and the covariance of the events that uninformed nodes become informed. This universality allows us to easily analyze the classic push, pull, and push-pull protocols both in their pure version and in several variations such as messages failing with constant probability or nodes calling a random number of others each round. Some dynamic models can be analyzed as well, e.g., when the network is a $G(n,p)$ random graph sampled independently each round [Clementi et al. (ESA 2013)]. Despite this generality, our method determines the expected rumor spreading time precisely apart from additive constants, which is more precise than almost all previous works. We also prove tail bounds showing that a deviation from the expectation by more than an additive number of $r$ rounds occurs with probability at most $\exp(-Ω(r))$. We further use our method to discuss the common assumption that nodes can answer any number of incoming calls. We observe that the restriction that only one call can be answered leads to a significant increase of the runtime of the push-pull protocol. In particular, the double logarithmic end phase of the process now takes logarithmic time. This also increases the message complexity from the asymptotically optimal $Θ(n \log\log n)$ [Karp, Shenker, Schindelhauer, Vöcking (FOCS 2000)] to $Θ(n \log n)$. We propose a simple variation of the push-pull protocol that reverts back to the double logarithmic end phase and thus to the $Θ(n \log\log n)$ message complexity.

Submitted 20 March, 2023; originally announced March 2023.

Comments: Version submitted to ICALP 2017. This version contains all proofs that were omitted in the final conference version
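
To make the push protocol from the abstract above concrete, here is a minimal simulation on a fully connected network. It is a sketch under assumptions: every informed node calls one node chosen uniformly at random from all $n$ nodes (possibly itself) in each round, no messages fail, and the name `push_rounds` is hypothetical.

```python
import random

def push_rounds(n, rng):
    """Simulate the push protocol; return the rounds until all n nodes are informed."""
    informed = {0}  # node 0 starts with the rumor
    rounds = 0
    while len(informed) < n:
        # Every informed node calls one uniformly random node and informs it.
        callees = {rng.randrange(n) for _ in informed}
        informed |= callees
        rounds += 1
    return rounds

if __name__ == "__main__":
    rng = random.Random(0)
    runs = [push_rounds(1024, rng) for _ in range(20)]
    print(sum(runs) / len(runs))
```

Since the number of informed nodes can at most double per round, any run needs at least $\log_2 n$ rounds; the late phase, in which few uninformed nodes remain to be hit by random calls, accounts for the rest of the spreading time.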

arXiv:2303.07455 [pdf, ps, other]

doi 10.1016/j.artint.2023.103906

(1+1) Genetic Programming With Functionally Complete Instruction Sets Can Evolve Boolean Conjunctions and Disjunctions with Arbitrarily Small Error

Authors: Benjamin Doerr, Andrei Lissovoi, Pietro S. Oliveto

Abstract: Recently it has been proven that simple GP systems can efficiently evolve a conjunction of $n$ variables if they are equipped with the minimal required components. In this paper, we make a considerable step forward by analysing the behaviour and performance of a GP system for evolving a Boolean conjunction or disjunction of $n$ variables using a complete function set that allows the expression of any Boolean function of up to $n$ variables. First we rigorously prove that a GP system using the complete truth table to evaluate the program quality, and equipped with both the AND and OR operators and positive literals, evolves the exact target function in $O(\ell n \log^2 n)$ iterations in expectation, where $\ell \geq n$ is a limit on the size of any accepted tree. Additionally, we show that when a polynomial sample of possible inputs is used to evaluate the solution quality, conjunctions or disjunctions with any polynomially small generalisation error can be evolved with probability $1 - O(\log^2(n)/n)$. The latter result also holds if GP uses AND, OR and positive and negated literals, thus has the power to express any Boolean function of $n$ distinct variables. To prove our results we introduce a super-multiplicative drift theorem that gives significantly stronger runtime bounds when the expected progress is only slightly super-linear in the distance from the optimum.

Submitted 13 March, 2023; originally announced March 2023.

Comments: 18 pages. Substantial text overlap with arXiv:1903.11936

Journal ref: Artificial Intelligence 319: 103906 (2023)

arXiv:2302.14420 [pdf, ps, other]

doi 10.1016/j.tcs.2024.114622

Estimation-of-Distribution Algorithms for Multi-Valued Decision Variables

Authors: Firas Ben Jedidia, Benjamin Doerr, Martin S. Krejca

Abstract: The majority of research on estimation-of-distribution algorithms (EDAs) concentrates on pseudo-Boolean optimization and permutation problems, leaving the domain of EDAs for problems in which the decision variables can take more than two values, but which are not permutation problems, mostly unexplored. To render this domain more accessible, we propose a natural way to extend the known univariate EDAs to this setting. Different from a naive reduction to the binary case, our approach avoids additional constraints. Since understanding genetic drift is crucial for an optimal parameter choice, we extend the known quantitative analysis of genetic drift to EDAs for multi-valued variables. Roughly speaking, when the variables take $r$ different values, the time for genetic drift to become significant is $r$ times shorter than in the binary case. Consequently, the update strength of the probabilistic model has to be chosen $r$ times lower now. To investigate how desired model updates take place in this framework, we undertake a mathematical runtime analysis on the $r$-valued LeadingOnes problem. We prove that with the right parameters, the multi-valued UMDA solves this problem efficiently in $O(r\ln(r)^2 n^2 \ln(n))$ function evaluations. This bound is nearly tight as our lower bound $Ω(r\ln(r) n^2 \ln(n))$ shows. Overall, our work shows that our good understanding of binary EDAs naturally extends to the multi-valued setting, and it gives advice on how to set the main parameters of multi-valued EDAs.

Submitted 1 January, 2024; v1 submitted 28 February, 2023; originally announced February 2023.

Journal ref: Theoretical Computer Science 1003 (2024) 114622

arXiv:2302.12570 [pdf, other]

Lasting Diversity and Superior Runtime Guarantees for the $(μ+1)$ Genetic Algorithm

Authors: Benjamin Doerr, Aymen Echarghaoui, Mohammed Jamal, Martin S. Krejca

Abstract: Most evolutionary algorithms (EAs) used in practice employ crossover. In contrast, only for few and mostly artificial examples a runtime advantage from crossover could be proven with mathematical means. The most convincing such result shows that the $(μ+1)$ genetic algorithm (GA) with population size $μ=O(n)$ optimizes jump functions with gap size $k \ge 3$ in time $O(n^k / μ+ n^{k-1}\log n)$, beating the $Θ(n^k)$ runtime of many mutation-based EAs. This result builds on a proof that the GA occasionally and then for an expected number of $Ω(μ^2)$ iterations has a population that is not dominated by a single genotype. In this work, we show that this diversity persists with high probability for a time exponential in $μ$ (instead of quadratic). From this better understanding of the population diversity, we obtain stronger runtime guarantees, among them the statement that for all $c\ln(n)\leμ\le n/\log n$, with $c$ a suitable constant, the runtime of the $(μ+1)$ GA on $\mathrm{Jump}_k$, with $k \ge 3$, is $O(n^{k-1})$. Consequently, already with logarithmic population sizes, the GA gains a speed-up of order $Ω(n)$ from crossover.

Submitted 24 February, 2023; originally announced February 2023.

arXiv:2302.08021 [pdf, ps, other]

Fourier Analysis Meets Runtime Analysis: Precise Runtimes on Plateaus

Authors: Benjamin Doerr, Andrew James Kelley

Abstract: We propose a new method based on discrete Fourier analysis to analyze the time evolutionary algorithms spend on plateaus. This immediately gives a concise proof of the classic estimate of the expected runtime of the $(1+1)$ evolutionary algorithm on the Needle problem due to Garnier, Kallel, and Schoenauer (1999). We also use this method to analyze the runtime of the $(1+1)$ evolutionary algorithm on a new benchmark consisting of $n/\ell$ plateaus of effective size $2^\ell-1$ which have to be optimized sequentially in a LeadingOnes fashion. Using our new method, we determine the precise expected runtime both for static and fitness-dependent mutation rates. We also determine the asymptotically optimal static and fitness-dependent mutation rates. For $\ell = o(n)$, the optimal static mutation rate is approximately $1.59/n$. The optimal fitness dependent mutation rate, when the first $k$ fitness-relevant bits have been found, is asymptotically $1/(k+1)$. These results, so far only proven for the single-instance problem LeadingOnes, thus hold for a much broader class of problems. We expect similar extensions to be true for other important results on LeadingOnes. We are also optimistic that our Fourier analysis approach can be applied to other plateau problems as well.

Submitted 1 May, 2023; v1 submitted 15 February, 2023; originally announced February 2023.

Comments: 43 pages. This is the full version of a paper appearing in the proceedings of GECCO 2023. Version 3 improves notation, adds more references, and fixes a small error

arXiv:2301.11004 [pdf, other]

NLP as a Lens for Causal Analysis and Perception Mining to Infer Mental Health on Social Media

Authors: Muskan Garg, Chandni Saxena, Usman Naseem, Bonnie J Dorr

Abstract: Interactions among humans on social media often convey intentions behind their actions, yielding a psychological language resource for Mental Health Analysis (MHA) of online users. The success of Computational Intelligence Techniques (CIT) for inferring mental illness from such social media resources points to NLP as a lens for causal analysis and perception mining. However, we argue that more consequential and explainable research is required for optimal impact on clinical psychology practice and personalized mental healthcare. To bridge this gap, we posit two significant dimensions: (1) Causal analysis to illustrate a cause and effect relationship in the user generated text; (2) Perception mining to infer psychological perspectives of social effects on online users' intentions. Within the scope of Natural Language Processing (NLP), we further explore critical areas of inquiry associated with these two dimensions, specifically through recent advancements in discourse analysis. This position paper guides the community to explore solutions in this space and advance the state of practice in developing conversational agents for inferring mental health from social media. We advocate for a more explainable approach toward modeling computational psychology problems through the lens of language as we observe an increased number of research contributions in dataset and problem formulation for causal relation extraction and perception enhancements while inferring mental states.

Submitted 22 August, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

arXiv:2301.01319 [pdf, other]

The ReSWARM Microgravity Flight Experiments: Planning, Control, and Model Estimation for On-Orbit Close Proximity Operations

Authors: Bryce Doerr, Keenan Albee, Monica Ekal, Rodrigo Ventura, Richard Linares

Abstract: On-orbit close proximity operations involve robotic spacecraft maneuvering and making decisions for a growing number of mission scenarios demanding autonomy, including on-orbit assembly, repair, and astronaut assistance. Of these scenarios, on-orbit assembly is an enabling technology that will allow large space structures to be built in-situ, using smaller building block modules. However, robotic on-orbit assembly involves a number of technical hurdles such as changing system models. For instance, grappled modules moved by a free-flying "assembler" robot can cause significant shifts in system inertial properties, which has cascading impacts on motion planning and control portions of the autonomy stack. Further, on-orbit assembly and other scenarios require collision-avoiding motion planning, particularly when operating in a "construction site" scenario of multiple assembler robots and structures. These complicating factors, relevant to many autonomous microgravity robotics use cases, are tackled in the ReSWARM flight experiments as a set of tests on the International Space Station using NASA's Astrobee robots. RElative Satellite sWarming and Robotic Maneuvering, or ReSWARM, demonstrates multiple key technologies for close proximity operations and on-orbit assembly: (1) global long-horizon planning, accomplished using offline and online sampling-based planner options that consider the system dynamics; (2) on-orbit reconfiguration model learning, using the recently-proposed RATTLE information-aware planning framework; and (3) robust control tools to provide low-level control robustness using current system knowledge. These approaches are detailed individually and in an "on-orbit assembly scenario" of multi-waypoint tracking on-orbit. Additionally, detail is provided discussing the practicalities of hardware implementation and unique aspects of working with Astrobee in microgravity.

Submitted 16 July, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

arXiv:2211.13084 [pdf, ps, other]

doi 10.1109/TEVC.2023.3320278

Runtime Analysis for the NSGA-II: Proving, Quantifying, and Explaining the Inefficiency For Many Objectives

Authors: Weijie Zheng, Benjamin Doerr

Abstract: The NSGA-II is one of the most prominent algorithms to solve multi-objective optimization problems. Despite numerous successful applications, several studies have shown that the NSGA-II is less effective for larger numbers of objectives. In this work, we use mathematical runtime analyses to rigorously demonstrate and quantify this phenomenon. We show that even on the simple $m$-objective generalization of the discrete OneMinMax benchmark, where every solution is Pareto optimal, the NSGA-II also with large population sizes cannot compute the full Pareto front (objective vectors of all Pareto optima) in sub-exponential time when the number of objectives is at least three. The reason for this unexpected behavior lies in the fact that in the computation of the crowding distance, the different objectives are regarded independently. This is not a problem for two objectives, where any sorting of a pair-wise incomparable set of solutions according to one objective is also such a sorting according to the other objective (in the inverse order).

Submitted 26 September, 2023; v1 submitted 23 November, 2022; originally announced November 2022.

Comments: Accepted for publication in "IEEE Transactions on Evolutionary Computation"

arXiv:2211.08202 [pdf, other]

doi 10.24963/ijcai.2023/628

A Mathematical Runtime Analysis of the Non-dominated Sorting Genetic Algorithm III (NSGA-III)

Authors: Simon Wietheger, Benjamin Doerr

Abstract: The Non-dominated Sorting Genetic Algorithm II (NSGA-II) is the most prominent multi-objective evolutionary algorithm for real-world applications. While it performs evidently well on bi-objective optimization problems, empirical studies suggest that it is less effective when applied to problems with more than two objectives. A recent mathematical runtime analysis confirmed this observation by proving the NSGA-II for an exponential number of iterations misses a constant factor of the Pareto front of the simple 3-objective OneMinMax problem. In this work, we provide the first mathematical runtime analysis of the NSGA-III, a refinement of the NSGA-II aimed at better handling more than two objectives. We prove that the NSGA-III with sufficiently many reference points -- a small constant factor more than the size of the Pareto front, as suggested for this algorithm -- computes the complete Pareto front of the 3-objective OneMinMax benchmark in an expected number of $O(n \log n)$ iterations. This result holds for all population sizes (that are at least the size of the Pareto front). It shows a drastic advantage of the NSGA-III over the NSGA-II on this benchmark. The mathematical arguments used here and in previous work on the NSGA-II suggest that similar findings are likely for other benchmarks with three or more objectives.

Submitted 12 June, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

Comments: Long version of a paper appearing at IJCAI 2023

Journal ref: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, Main Track. Pages 5657-5665. 2023

arXiv:2210.03618 [pdf, other]

doi 10.1145/3512290.3528868

The $(1+(λ,λ))$ Global SEMO Algorithm

Authors: Benjamin Doerr, Omar El Hadri, Adrien Pinard

Abstract: The $(1+(λ,λ))$ genetic algorithm is a recently proposed single-objective evolutionary algorithm with several interesting properties. We show that its main working principle, mutation with a high rate and crossover as repair mechanism, can be transported also to multi-objective evolutionary computation. We define the $(1+(λ,λ))$ global SEMO algorithm, a variant of the classic global SEMO algorithm, and prove that it optimizes the OneMinMax benchmark asymptotically faster than the global SEMO. Following the single-objective example, we design a one-fifth rule inspired dynamic parameter setting (to the best of our knowledge for the first time in discrete multi-objective optimization) and prove that it further improves the runtime to $O(n^2)$, whereas the best runtime guarantee for the global SEMO is only $O(n^2 \log n)$.

Submitted 7 October, 2022; originally announced October 2022.

Comments: Author generated version of a paper at GECCO 2022

Journal ref: The (1 + (λ, λ)) global SEMO algorithm. GECCO 2022: 520-528. ACM

arXiv:2209.13974 [pdf, other]

From Understanding the Population Dynamics of the NSGA-II to the First Proven Lower Bounds

Authors: Benjamin Doerr, Zhongdi Qu

Abstract: Due to the more complicated population dynamics of the NSGA-II, none of the existing runtime guarantees for this algorithm is accompanied by a non-trivial lower bound. Via a first mathematical understanding of the population dynamics of the NSGA-II, that is, by estimating the expected number of individuals having a certain objective value, we prove that the NSGA-II with suitable population size needs $Ω(Nn\log n)$ function evaluations to find the Pareto front of the OneMinMax problem and $Ω(Nn^k)$ evaluations on the OneJumpZeroJump problem with jump size $k$. These bounds are asymptotically tight (that is, they match previously shown upper bounds) and show that the NSGA-II here does not even in terms of the parallel runtime (number of iterations) profit from larger population sizes. For the OneJumpZeroJump problem and when the same sorting is used for the computation of the crowding distance contributions of the two objectives, we even obtain a runtime estimate that is tight including the leading constant.

Submitted 10 March, 2023; v1 submitted 28 September, 2022; originally announced September 2022.

Comments: Extended version of a paper that appears in the proceedings of AAAI 2023
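
The two benchmarks named in the abstract above are not defined in this listing. The following is a minimal sketch of how they are usually stated in the runtime-analysis literature; the function names and the list-of-bits representation are assumptions made for illustration, and the definitions should be checked against the paper itself.

```python
def one_min_max(x):
    """Bi-objective OneMinMax: (number of zeros, number of ones), both maximized."""
    ones = sum(x)
    return len(x) - ones, ones


def jump(ones, n, k):
    """Classic Jump_k value for a length-n bit string with `ones` one-bits.

    Assumed definition: k + ones outside the fitness valley, n - ones inside it.
    """
    if ones <= n - k or ones == n:
        return k + ones
    return n - ones


def one_jump_zero_jump(x, k):
    """Bi-objective OneJumpZeroJump: Jump_k on the ones and on the zeros of x."""
    n = len(x)
    ones = sum(x)
    return jump(ones, n, k), jump(n - ones, n, k)
```

Under these assumed definitions, every bit string is Pareto-optimal for OneMinMax, whereas OneJumpZeroJump places a valley of low fitness next to each of the two extreme points, which is what makes it multimodal.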

arXiv:2208.08759 [pdf, other]

Runtime Analysis for the NSGA-II: Provable Speed-Ups From Crossover

Authors: Benjamin Doerr, Zhongdi Qu

Abstract: Very recently, the first mathematical runtime analyses for the NSGA-II, the most common multi-objective evolutionary algorithm, have been conducted. Continuing this research direction, we prove that the NSGA-II optimizes the OneJumpZeroJump benchmark asymptotically faster when crossover is employed. Together with a parallel independent work by Dang, Opris, Salehi, and Sudholt, this is the first time such an advantage of crossover is proven for the NSGA-II. Our arguments can be transferred to single-objective optimization. They then prove that crossover can speed up the $(μ+1)$ genetic algorithm in a different way and more pronounced than known before. Our experiments confirm the added value of crossover and show that the observed advantages are even larger than what our proofs can guarantee.

Submitted 15 March, 2023; v1 submitted 18 August, 2022; originally announced August 2022.

Comments: Extended version of a paper that appears in the proceedings of AAAI 2023

arXiv:2207.04674 [pdf, other]

CAMS: An Annotated Corpus for Causal Analysis of Mental Health Issues in Social Media Posts

Authors: Muskan Garg, Chandni Saxena, Veena Krishnan, Ruchi Joshi, Sriparna Saha, Vijay Mago, Bonnie J Dorr

Abstract: The research community has witnessed substantial growth in the detection of mental health issues and their associated reasons from analysis of social media. We introduce a new dataset for Causal Analysis of Mental health issues in Social media posts (CAMS). Our contributions for causal analysis are two-fold: causal interpretation and causal categorization. We introduce an annotation schema for this task of causal analysis. We demonstrate the efficacy of our schema on two different datasets: (i) crawling and annotating 3155 Reddit posts and (ii) re-annotating the publicly available SDCNL dataset of 1896 instances for interpretable causal analysis. We further combine these into the CAMS dataset and make this resource publicly available along with associated source code: https://github.com/drmuskangarg/CAMS. We present experimental results of models learned from the CAMS dataset and demonstrate that a classic Logistic Regression model outperforms the next best (CNN-LSTM) model by 4.9% accuracy.

Submitted 11 July, 2022; originally announced July 2022.

Comments: 10 pages

Report number: 6387--6396

Journal ref: Proceedings of the Thirteenth Language Resources and Evaluation Conference, LREC 2022

arXiv:2207.04045 [pdf, other]

doi 10.1007/s00453-023-01146-8

Runtime Analysis for Permutation-based Evolutionary Algorithms

Authors: Benjamin Doerr, Yassine Ghannane, Marouane Ibn Brahim

Abstract: While the theoretical analysis of evolutionary algorithms (EAs) has made significant progress for pseudo-Boolean optimization problems in the last 25 years, only sporadic theoretical results exist on how EAs solve permutation-based problems. To overcome the lack of permutation-based benchmark problems, we propose a general way to transfer the classic pseudo-Boolean benchmarks into benchmarks defined on sets of permutations. We then conduct a rigorous runtime analysis of the permutation-based $(1+1)$ EA proposed by Scharnow, Tinnefeld, and Wegener (2004) on the analogues of the LeadingOnes and Jump benchmarks. The latter shows that, different from bit-strings, it is not only the Hamming distance that determines how difficult it is to mutate a permutation $σ$ into another one $τ$, but also the precise cycle structure of $στ^{-1}$. For this reason, we also regard the more symmetric scramble mutation operator. We observe that it not only leads to simpler proofs, but also reduces the runtime on jump functions with odd jump size by a factor of $Θ(n)$. Finally, we show that a heavy-tailed version of the scramble operator, as in the bit-string case, leads to a speed-up of order $m^{Θ(m)}$ on jump functions with jump size $m$. A short empirical analysis confirms these findings, but also reveals that small implementation details like the rate of void mutations can make an important difference.

Submitted 20 April, 2024; v1 submitted 5 July, 2022; originally announced July 2022.

Comments: Journal version of our paper at GECCO 2022, appeared in Algorithmica. 52 pages. arXiv admin note: substantial text overlap with arXiv:2204.07637

Journal ref: Algorithmica 86(1): 90-129 (2024)

arXiv:2206.11198 [pdf, ps, other]

doi 10.1007/978-3-031-14721-0_33

General Univariate Estimation-of-Distribution Algorithms

Authors: Benjamin Doerr, Marc Dufay

Abstract: We propose a general formulation of a univariate estimation-of-distribution algorithm (EDA). It naturally incorporates the three classic univariate EDAs compact genetic algorithm, univariate marginal distribution algorithm and population-based incremental learning as well as the max-min ant system with iteration-best update. Our unified description of the existing algorithms allows a unified analysis of these; we demonstrate this by providing an analysis of genetic drift that immediately gives the existing results proven separately for the four algorithms named above. Our general model also includes EDAs that are more efficient than the existing ones and these may not be difficult to find as we demonstrate for the OneMax and LeadingOnes benchmarks.

Submitted 6 October, 2022; v1 submitted 22 June, 2022; originally announced June 2022.

Comments: Conference version with missing proofs in the appendix

Journal ref: General Univariate Estimation-of-Distribution Algorithms. PPSN (2) 2022: 470-484

arXiv:2206.09090 [pdf, ps, other]

From Understanding Genetic Drift to a Smart-Restart Mechanism for Estimation-of-Distribution Algorithms

Authors: Weijie Zheng, Benjamin Doerr

Abstract: Estimation-of-distribution algorithms (EDAs) are optimization algorithms that learn a distribution on the search space from which good solutions can be sampled easily. A key parameter of most EDAs is the sample size (population size). If the population size is too small, the update of the probabilistic model builds on few samples, leading to the undesired effect of genetic drift. Too large population sizes avoid genetic drift, but slow down the process. Building on a recent quantitative analysis of how the population size leads to genetic drift, we design a smart-restart mechanism for EDAs. By stopping runs when the risk for genetic drift is high, it automatically runs the EDA in good parameter regimes. Via a mathematical runtime analysis, we prove a general performance guarantee for this smart-restart scheme. This in particular shows that in many situations where the optimal (problem-specific) parameter values are known, the restart scheme automatically finds these, leading to the asymptotically optimal performance. We also conduct an extensive experimental analysis. On four classic benchmark problems, we clearly observe the critical influence of the population size on the performance, and we find that the smart-restart scheme leads to a performance close to the one obtainable with optimal parameter values. Our results also show that previous theory-based suggestions for the optimal population size can be far from the optimal ones, leading to a performance clearly inferior to the one obtained via the smart-restart scheme.
We also conduct experiments with PBIL (cross-entropy algorithm) on two combinatorial optimization problems from the literature, the max-cut problem and the bipartition problem. Again, we observe that the smart-restart mechanism finds much better values for the population size than those suggested in the literature, leading to a much better performance.

Submitted 3 November, 2023; v1 submitted 17 June, 2022; originally announced June 2022.

Comments: Extended version of our GECCO 2020 paper. This article supersedes arXiv:2004.07141

Journal ref: Journal of Machine Learning Research 24 (2023) 1-40

arXiv:2205.03670 [pdf, other]

doi 10.1145/3512290.3528825

Automated Algorithm Selection for Radar Network Configuration

Authors: Quentin Renau, Johann Dreo, Alain Peres, Yann Semet, Carola Doerr, Benjamin Doerr

Abstract: The configuration of radar networks is a complex problem that is often performed manually by experts with the help of a simulator. Different numbers and types of radars as well as different locations that the radars shall cover give rise to different instances of the radar configuration problem. The exact modeling of these instances is complex, as the quality of the configurations depends on a large number of parameters, on internal radar processing, and on the terrains on which the radars need to be placed. Classic optimization algorithms can therefore not be applied to this problem, and we rely on "trial-and-error" black-box approaches. In this paper, we study the performances of 13 black-box optimization algorithms on 153 radar network configuration problem instances. The algorithms perform considerably better than human experts. Their ranking, however, depends on the budget of configurations that can be evaluated and on the elevation profile of the location. We therefore also investigate automated algorithm selection approaches. Our results demonstrate that a pipeline that extracts instance features from the elevation of the terrain performs on par with the classical, far more expensive approach that extracts features from the objective function.

Submitted 22 April, 2023; v1 submitted 7 May, 2022; originally announced May 2022.

Comments: Author-generated version of a paper in the proceedings of The Genetic and Evolutionary Computation Conference 2022 (GECCO 2022)

Journal ref: Automated algorithm selection for radar network configuration. GECCO 2022: 1263-1271

arXiv:2204.13750 [pdf, ps, other]

doi 10.1109/TEVC.2023.3250552

A First Runtime Analysis of the NSGA-II on a Multimodal Problem

Authors: Benjamin Doerr, Zhongdi Qu

Abstract: Very recently, the first mathematical runtime analyses of the multi-objective evolutionary optimizer NSGA-II have been conducted. We continue this line of research with a first runtime analysis of this algorithm on a benchmark problem consisting of two multimodal objectives. We prove that if the population size $N$ is at least four times the size of the Pareto front, then the NSGA-II with four different ways to select parents and bit-wise mutation optimizes the OneJumpZeroJump benchmark with jump size $2 \le k \le n/4$ in time $O(N n^k)$. When using fast mutation, a recently proposed heavy-tailed mutation operator, this guarantee improves by a factor of $k^{Ω(k)}$. Overall, this work shows that the NSGA-II copes with the local optima of the OneJumpZeroJump problem at least as well as the global SEMO algorithm.

Submitted 4 January, 2024; v1 submitted 28 April, 2022; originally announced April 2022.

Comments: Appeared in the Transactions on Evolutionary Computation. Extends a paper that appeared in the Proceedings of PPSN 2022

Journal ref: IEEE Transactions on Evolutionary Computation 27(5): 1288-1297 (2023)

arXiv:2204.07637 [pdf, ps, other]

doi 10.1145/3512290.3528720

Towards a Stronger Theory for Permutation-based Evolutionary Algorithms

Authors: Benjamin Doerr, Yassine Ghannane, Marouane Ibn Brahim

Abstract: While the theoretical analysis of evolutionary algorithms (EAs) has made significant progress for pseudo-Boolean optimization problems in the last 25 years, only sporadic theoretical results exist on how EAs solve permutation-based problems. To overcome the lack of permutation-based benchmark problems, we propose a general way to transfer the classic pseudo-Boolean benchmarks into benchmarks defined on sets of permutations. We then conduct a rigorous runtime analysis of the permutation-based $(1+1)$ EA proposed by Scharnow, Tinnefeld, and Wegener (2004) on the analogues of the LeadingOnes and Jump benchmarks. The latter shows that, different from bit-strings, it is not only the Hamming distance that determines how difficult it is to mutate a permutation $σ$ into another one $τ$, but also the precise cycle structure of $στ^{-1}$. For this reason, we also regard the more symmetric scramble mutation operator. We observe that it not only leads to simpler proofs, but also reduces the runtime on jump functions with odd jump size by a factor of $Θ(n)$. Finally, we show that a heavy-tailed version of the scramble operator, as in the bit-string case, leads to a speed-up of order $m^{Θ(m)}$ on jump functions with jump size $m$.

Submitted 6 October, 2022; v1 submitted 15 April, 2022; originally announced April 2022.

Comments: Conference version with an appendix containing the proofs omitted for reasons of space

Journal ref: GECCO 2022: 1390-1398

arXiv:2204.02097 [pdf, ps, other]

doi 10.1007/s00453-023-01135-x

Simulated Annealing is a Polynomial-Time Approximation Scheme for the Minimum Spanning Tree Problem

Authors: Benjamin Doerr, Amirhossein Rajabi, Carsten Witt

Abstract: We prove that Simulated Annealing with an appropriate cooling schedule computes arbitrarily tight constant-factor approximations to the minimum spanning tree problem in polynomial time. This result was conjectured by Wegener (2005). More precisely, denoting by $n, m, w_{\max}$, and $w_{\min}$ the number of vertices and edges as well as the maximum and minimum edge weight of the MST instance, we prove that simulated annealing with initial temperature $T_0 \ge w_{\max}$ and multiplicative cooling schedule with factor $1-1/\ell$, where $\ell = ω(mn\ln(m))$, with probability at least $1-1/m$ computes in time $O(\ell (\ln\ln (\ell) + \ln(T_0/w_{\min}) ))$ a spanning tree with weight at most $1+κ$ times the optimum weight, where $1+κ= \frac{(1+o(1))\ln(\ell m)}{\ln(\ell) -\ln (mn\ln (m))}$. Consequently, for any $ε>0$, we can choose $\ell$ in such a way that a $(1+ε)$-approximation is found in time $O((mn\ln(n))^{1+1/ε+o(1)}(\ln\ln n + \ln(T_0/w_{\min})))$ with probability at least $1-1/m$. In the special case of so-called $(1+ε)$-separated weights, this algorithm computes an optimal solution (again in time $O( (mn\ln(n))^{1+1/ε+o(1)}(\ln\ln n + \ln(T_0/w_{\min})))$), which is a significant speed-up over Wegener's runtime guarantee of $O(m^{8 + 8/ε})$.

Submitted 22 July, 2023; v1 submitted 5 April, 2022; originally announced April 2022.

Comments: 19 pages. Extended version of a paper at GECCO 2022. This version is accepted for publication in Algorithmica

Journal ref: Simulated annealing is a polynomial-time approximation scheme for the minimum spanning tree problem. Algorithmica. 2023

arXiv:2203.10659 [pdf, other]

From Stance to Concern: Adaptation of Propositional Analysis to New Tasks and Domains

Authors: Brodie Mather, Bonnie J Dorr, Adam Dalton, William de Beaumont, Owen Rambow, Sonja M. Schmer-Galunder

Abstract: We present a generalized paradigm for adaptation of propositional analysis (predicate-argument pairs) to new tasks and domains. We leverage an analogy between stances (belief-driven sentiment) and concerns (topical issues with moral dimensions/endorsements) to produce an explanatory representation. A key contribution is the combination of semi-automatic resource building for extraction of domain-dependent concern types (with 2-4 hours of human labor per domain) and an entirely automatic procedure for extraction of domain-independent moral dimensions and endorsement values. Prudent (automatic) selection of terms from propositional structures for lexical expansion (via semantic similarity) produces new moral dimension lexicons at three levels of granularity beyond a strong baseline lexicon. We develop a ground truth (GT) based on expert annotators and compare our concern detection output to GT, to yield 231% improvement in recall over baseline, with only a 10% loss in precision. F1 yields 66% improvement over baseline and 97.8% of human performance. Our lexically based approach yields large savings over approaches that employ costly human labor and model building. We provide to the community a newly expanded moral dimension/value lexicon, annotation guidelines, and GT.

Submitted 20 March, 2022; originally announced March 2022.

Comments: Accepted to Findings of the Association for Computational Linguistics, 2022

MSC Class: 68T50 ACM Class: I.2.7

arXiv:2203.02693 [pdf, ps, other]

Approximation Guarantees for the Non-Dominated Sorting Genetic Algorithm II (NSGA-II)

Authors: Weijie Zheng, Benjamin Doerr

Abstract: Recent theoretical works have shown that the NSGA-II efficiently computes the full Pareto front when the population size is large enough. In this work, we study how well it approximates the Pareto front when the population size is smaller. For the OneMinMax benchmark, we point out situations in which the parents and offspring cover well the Pareto front, but the next population has large gaps on the Pareto front. Our mathematical proofs suggest as reason for this undesirable behavior that the NSGA-II in the selection stage computes the crowding distance once and then removes individuals with smallest crowding distance without considering that a removal increases the crowding distance of some individuals. We then analyze two variants not prone to this problem. For the NSGA-II that updates the crowding distance after each removal (Kukkonen and Deb (2006)) and the steady-state NSGA-II (Nebro and Durillo (2009)), we prove that the gaps in the Pareto front are never more than a small constant factor larger than the theoretical minimum. This is the first mathematical work on the approximation ability of the NSGA-II and the first runtime analysis for the steady-state NSGA-II. Experiments also show the superior approximation ability of the two NSGA-II variants.

Submitted 1 October, 2023; v1 submitted 5 March, 2022; originally announced March 2022.

Comments: Extended version of our GECCO 2022 paper
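
The selection effect described in the abstract above can be illustrated with a small sketch. It uses the textbook crowding-distance computation and contrasts removing several individuals after a single computation with recomputing the distance after each removal. The function names and the deterministic tie-breaking (first index wins) are assumptions for illustration only; the NSGA-II variants analyzed in the paper break ties randomly and operate on full populations.

```python
import math


def crowding_distances(points):
    """Textbook crowding distance of objective vectors in one front."""
    n = len(points)
    dist = [0.0] * n
    if n == 0:
        return dist
    for m in range(len(points[0])):
        # Sort indices by the m-th objective; boundary points get infinity.
        order = sorted(range(n), key=lambda i: points[i][m])
        lo, hi = points[order[0]][m], points[order[-1]][m]
        dist[order[0]] = dist[order[-1]] = math.inf
        if hi == lo:
            continue
        for pos in range(1, n - 1):
            gap = points[order[pos + 1]][m] - points[order[pos - 1]][m]
            dist[order[pos]] += gap / (hi - lo)
    return dist


def remove_without_update(points, r):
    """Compute the crowding distance once, then drop the r smallest."""
    dist = crowding_distances(points)
    drop = set(sorted(range(len(points)), key=lambda i: dist[i])[:r])
    return [p for i, p in enumerate(points) if i not in drop]


def remove_with_update(points, r):
    """Drop one individual at a time, recomputing the distance each time."""
    pts = list(points)
    for _ in range(r):
        dist = crowding_distances(pts)
        del pts[min(range(len(pts)), key=lambda i: dist[i])]
    return pts
```

On the five OneMinMax-style points (0,4), (1,3), (2,2), (3,1), (4,0) with two removals, the one-shot version in this sketch drops the neighbors (1,3) and (2,2) and leaves a gap of 3 between (0,4) and (3,1), while the updating version keeps (0,4), (2,2), (4,0) with evenly spaced gaps of 2.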

arXiv:2201.12158 [pdf, other]

doi 10.1162/evco_a_00313

Stagnation Detection Meets Fast Mutation

Authors: Benjamin Doerr, Amirhossein Rajabi

Abstract: Two mechanisms have recently been proposed that can significantly speed up finding distant improving solutions via mutation, namely using a random mutation rate drawn from a heavy-tailed distribution ("fast mutation", Doerr et al. (2017)) and increasing the mutation strength based on stagnation detection (Rajabi and Witt (2020)). Whereas the latter can obtain the asymptotically best probability of finding a single desired solution in a given distance, the former is more robust and performs much better when many improving solutions in some distance exist. In this work, we propose a mutation strategy that combines ideas of both mechanisms. We show that it can also obtain the best possible probability of finding a single distant solution. However, when several improving solutions exist, it can outperform both the stagnation-detection approach and fast mutation. The new operator is more than an interleaving of the two previous mechanisms and it also outperforms any such interleaving.

Submitted 3 May, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

Comments: 28 pages. Full version of a paper appearing at EvoCOP 2022

Journal ref: Theoretical Computer Science 946: 113670 (2023)
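
As background for the "fast mutation" mechanism the abstract above builds on, here is a minimal sketch of a heavy-tailed bit-wise mutation: a strength $α$ is drawn from a power-law distribution and each bit is then flipped independently with probability $α/n$. The range $[1..n/2]$, the default exponent, and all names are assumptions for illustration; this is not the combined operator proposed in the paper.

```python
import random


def sample_power_law(upper, beta, rng):
    """Draw an integer a in [1..upper] with probability proportional to a**(-beta)."""
    values = list(range(1, upper + 1))
    weights = [v ** -beta for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


def fast_mutation(x, beta=1.5, rng=None):
    """Flip each bit of x independently with a heavy-tailed random rate alpha/n."""
    rng = rng or random.Random()
    n = len(x)
    alpha = sample_power_law(max(1, n // 2), beta, rng)
    rate = alpha / n
    return [bit ^ 1 if rng.random() < rate else bit for bit in x]
```

Because large values of $α$ are drawn with only polynomially small probability, most calls behave like standard bit mutation while occasional calls flip many bits at once, which is the property that helps with distant improving solutions.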

arXiv:2112.08581 [pdf, ps, other]

doi 10.1016/j.artint.2023.104016

Mathematical Runtime Analysis for the Non-Dominated Sorting Genetic Algorithm II (NSGA-II)

Authors: Weijie Zheng, Benjamin Doerr

Abstract: The non-dominated sorting genetic algorithm II (NSGA-II) is the most intensively used multi-objective evolutionary algorithm (MOEA) in real-world applications. However, in contrast to several simple MOEAs analyzed also via mathematical means, no such study exists for the NSGA-II so far. In this work, we show that mathematical runtime analyses are feasible also for the NSGA-II. As particular results, we prove that with a population size four times larger than the size of the Pareto front, the NSGA-II with two classic mutation operators and four different ways to select the parents satisfies the same asymptotic runtime guarantees as the SEMO and GSEMO algorithms on the basic OneMinMax and LeadingOnesTrailingZeros benchmarks. However, if the population size is only equal to the size of the Pareto front, then the NSGA-II cannot efficiently compute the full Pareto front: for an exponential number of iterations, the population will always miss a constant fraction of the Pareto front. Our experiments confirm the above findings.

Submitted 10 October, 2023; v1 submitted 15 December, 2021; originally announced December 2021.

Comments: Journal version of the paper "Weijie Zheng, Yufei Liu, Benjamin Doerr: A First Mathematical Runtime Analysis of the Non-Dominated Sorting Genetic Algorithm II (NSGA-II). AAAI 2022. arXiv:2112.08581v3"

Journal ref: Artificial Intelligence 325 (2023), 104016

arXiv:2109.06584 [pdf, ps, other]

doi 10.1016/j.ic.2023.105125

Choosing the Right Algorithm With Hints From Complexity Theory

Authors: Shouda Wang, Weijie Zheng, Benjamin Doerr

Abstract: Choosing a suitable algorithm from the myriads of different search heuristics is difficult when faced with a novel optimization problem. In this work, we argue that the purely academic question of what could be the best possible algorithm in a certain broad class of black-box optimizers can give fruitful indications in which direction to search for good established optimization heuristics. We demonstrate this approach on the recently proposed DLB benchmark, for which the only known results are $O(n^3)$ runtimes for several classic evolutionary algorithms and an $O(n^2 \log n)$ runtime for an estimation-of-distribution algorithm. Our finding that the unary unbiased black-box complexity is only $O(n^2)$ suggests the Metropolis algorithm as an interesting candidate and we prove that it solves the DLB problem in quadratic time. Since we also prove that better runtimes cannot be obtained in the class of unary unbiased algorithms, we shift our attention to algorithms that use the information of more parents to generate new solutions. An artificial algorithm of this type having an $O(n \log n)$ runtime leads to the result that the significance-based compact genetic algorithm (sig-cGA) can solve the DLB problem also in time $O(n \log n)$ with high probability. Our experiments show a remarkably good performance of the Metropolis algorithm, clearly the best of all algorithms regarded for reasonable problem sizes.

Submitted 26 November, 2023; v1 submitted 14 September, 2021; originally announced September 2021.

Comments: 1 Figure. Journal version of a paper appearing at IJCAI 2021

Journal ref: Information and Computation 296, 105125, 2024

arXiv:2105.03090 [pdf, other]

doi 10.1007/s00453-022-00977-1

An Extended Jump Functions Benchmark for the Analysis of Randomized Search Heuristics

Authors: Henry Bambury, Antoine Bultel, Benjamin Doerr

Abstract: Jump functions are the most-studied non-unimodal benchmark in the theory of randomized search heuristics, in particular, evolutionary algorithms (EAs). They have significantly improved our understanding of how EAs escape from local optima. However, their particular structure -- to leave the local optimum one can only jump directly to the global optimum -- raises the question of how representative such results are. For this reason, we propose an extended class $\textsc{Jump}_{k,δ}$ of jump functions that contain a valley of low fitness of width $δ$ starting at distance $k$ from the global optimum. We prove that several previous results extend to this more general class: for all $k \le \frac{n^{1/3}}{\ln{n}}$ and $δ< k$, the optimal mutation rate for the $(1+1)$ EA is $\frac{δ}{n}$, and the fast $(1+1)$ EA runs faster than the classical $(1+1)$ EA by a factor super-exponential in $δ$. However, we also observe that some known results do not generalize: the randomized local search algorithm with stagnation detection, which is faster than the fast $(1+1)$ EA by a factor polynomial in $k$ on $\textsc{Jump}_k$, is slower by a factor polynomial in $n$ on some $\textsc{Jump}_{k,δ}$ instances. Computationally, the new class allows experiments with wider fitness valleys, especially when they lie further away from the global optimum.

Submitted 28 April, 2022; v1 submitted 7 May, 2021; originally announced May 2021.

Comments: Extended version of a paper that appeared in the proceedings of GECCO 2021. To appear in Algorithmica
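
The abstract above describes $\textsc{Jump}_{k,δ}$ only in words (a valley of low fitness of width $δ$ starting at distance $k$ from the global optimum). One plausible formalization, written here as an assumption rather than as the paper's exact definition, places the valley at the numbers of one-bits strictly between $n-k$ and $n-k+δ$; with $δ = k$ this reduces to the classic jump function.

```python
def extended_jump(x, k, delta):
    """Assumed form of Jump_{k,delta} on a list of bits x.

    Fitness is k + (number of ones) outside the valley and
    n - (number of ones) inside the valley (n-k, n-k+delta).
    """
    n = len(x)
    ones = sum(x)
    in_valley = n - k < ones < n - k + delta
    return n - ones if in_valley else k + ones
```

For example, with $n = 10$, $k = 4$ and $δ = 2$ the valley in this sketch consists of the strings with exactly 7 ones, so a search point with 6 ones can leave the local optimum by gaining two one-bits rather than all four.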

arXiv:2104.10799 [pdf, ps, other]

doi 10.1016/j.jco.2021.101589

On Negative Dependence Properties of Latin Hypercube Samples and Scrambled Nets

Authors: Benjamin Doerr, Michael Gnewuch

Abstract: We study the notion of $γ$-negative dependence of random variables. This notion is a relaxation of the notion of negative orthant dependence (which corresponds to $1$-negative dependence), but nevertheless it still ensures concentration of measure and allows to use large deviation bounds of Chernoff-Hoeffding- or Bernstein-type. We study random variables based on random points $P$. These random variables appear naturally in the analysis of the discrepancy of $P$ or, equivalently, of a suitable worst-case integration error of the quasi-Monte Carlo cubature that uses the points in $P$ as integration nodes. We introduce the correlation number, which is the smallest possible value of $γ$ that ensures $γ$-negative dependence. We prove that the random variables of interest based on Latin hypercube sampling or on $(t,m,d)$-nets do, in general, not have a correlation number of $1$, i.e., they are not negative orthant dependent. But it is known that the random variables based on Latin hypercube sampling in dimension $d$ are actually $γ$-negatively dependent with $γ\le e^d$, and the resulting probabilistic discrepancy bounds do only mildly depend on the $γ$-value.

Submitted 28 June, 2021; v1 submitted 21 April, 2021; originally announced April 2021.

Journal ref: Journal of Complexity, Vol. 67 (2021), 101589

arXiv:2104.06714 [pdf, ps, other]

doi 10.1007/s00453-023-01098-z

Lazy Parameter Tuning and Control: Choosing All Parameters Randomly From a Power-Law Distribution

Authors: Denis Antipov, Maxim Buzdalov, Benjamin Doerr

Abstract: Most evolutionary algorithms have multiple parameters and their values drastically affect the performance. Due to the often complicated interplay of the parameters, setting these values right for a particular problem (parameter tuning) is a challenging task. This task becomes even more complicated when the optimal parameter values change significantly during the run of the algorithm since then a dynamic parameter choice (parameter control) is necessary. In this work, we propose a lazy but effective solution, namely choosing all parameter values (where this makes sense) in each iteration randomly from a suitably scaled power-law distribution. To demonstrate the effectiveness of this approach, we perform runtime analyses of the $(1+(λ,λ))$ genetic algorithm with all three parameters chosen in this manner. We show that this algorithm on the one hand can imitate simple hill-climbers like the $(1+1)$ EA, giving the same asymptotic runtime on problems like OneMax, LeadingOnes, or Minimum Spanning Tree. On the other hand, this algorithm is also very efficient on jump functions, where the best static parameters are very different from those necessary to optimize simple problems. We prove a performance guarantee that is comparable to the best performance known for static parameters. For the most interesting case that the jump size $k$ is constant, we prove that our performance is asymptotically better than what can be obtained with any static parameter choice. We complement our theoretical results with a rigorous empirical study confirming what the asymptotic runtime results suggest.

Submitted 10 March, 2023; v1 submitted 14 April, 2021; originally announced April 2021.

Comments: Extended version of the paper that appeared at GECCO 2021. To appear in Algorithmica
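
As a rough illustration of the idea in the abstract above (drawing a parameter value afresh in each iteration from a power-law distribution), here is a minimal sketch. It assumes a bounded integer range $\{1, \dots, u\}$ with $\Pr[X = i] \propto i^{-β}$. The upper limit `upper`, the exponent `beta`, and the function names are illustrative assumptions; the abstract does not specify how the distribution is scaled for each of the three parameters of the $(1+(λ,λ))$ genetic algorithm.

```python
import random


def power_law_weights(upper, beta):
    """Unnormalized weights i**(-beta) for i = 1, ..., upper."""
    return [i ** (-beta) for i in range(1, upper + 1)]


def sample_power_law(upper, beta, rng):
    """Draw one integer from {1, ..., upper} with Pr[i] proportional to i**(-beta)."""
    values = range(1, upper + 1)
    weights = power_law_weights(upper, beta)
    return rng.choices(values, weights=weights, k=1)[0]
```

In an algorithm loop one would call `sample_power_law` once per iteration and per parameter, so that small values are chosen most often while large values still occur with polynomially small probability.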

arXiv:2104.03372 [pdf, ps, other]

doi 10.1145/3449639.3459352

Lower Bounds from Fitness Levels Made Easy

Authors: Benjamin Doerr, Timo Kötzing

Abstract: One of the first and easy-to-use techniques for proving run time bounds for evolutionary algorithms is the so-called method of fitness levels by Wegener. It uses a partition of the search space into a sequence of levels which are traversed by the algorithm in increasing order, possibly skipping levels. An easy, but often strong upper bound for the run time can then be derived by adding the reciprocals of the probabilities to leave the levels (or upper bounds for these). Unfortunately, a similarly effective method for proving lower bounds has not yet been established. The strongest such method, proposed by Sudholt (2013), requires a careful choice of the viscosity parameters $γ_{i,j}$, $0 \le i < j \le n$. In this paper we present two new variants of the method, one for upper and one for lower bounds. Besides the level leaving probabilities, they only rely on the probabilities that levels are visited at all. We show that these can be computed or estimated without great difficulty and apply our method to reprove the following known results in an easy and natural way. (i) The precise run time of the (1+1) EA on \textsc{LeadingOnes}. (ii) A lower bound for the run time of the (1+1) EA on \textsc{OneMax}, tight apart from an $O(n)$ term. (iii) A lower bound for the run time of the (1+1) EA on long $k$-paths. We also prove a tighter lower bound for the run time of the (1+1) EA on jump functions by showing that, regardless of the jump size, only with probability $O(2^{-n})$ the algorithm can avoid jumping over the valley of low fitness.

Submitted 28 April, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

Comments: Extended version of a paper appearing in the proceedings of GECCO 2021
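
The classical upper bound mentioned in the abstract above (adding the reciprocals of the level-leaving probabilities) can be sketched for the (1+1) EA on OneMax. This is an illustrative assumption-laden example, not the paper's new method: level $i$ is taken to be the set of strings with $i$ one-bits, and the leaving probability is bounded from below by the probability of flipping exactly one zero-bit and nothing else under standard bit mutation with rate $1/n$. The function names are hypothetical.

```python
def fitness_level_upper_bound(leave_prob_lower_bounds):
    """Sum of reciprocals of (lower bounds on) the level-leaving probabilities."""
    return sum(1.0 / p for p in leave_prob_lower_bounds)


def onemax_leave_prob(n, i):
    """Probability of flipping exactly one of the n - i zero-bits and no other bit."""
    return (n - i) * (1.0 / n) * (1.0 - 1.0 / n) ** (n - 1)


def onemax_upper_bound(n):
    """Fitness-level upper bound on the expected run time of the (1+1) EA on OneMax."""
    return fitness_level_upper_bound(onemax_leave_prob(n, i) for i in range(n))
```

Since $(1 - 1/n)^{n-1} \ge 1/e$, the value returned by `onemax_upper_bound(n)` is at most $e\,n\,H_n$, where $H_n$ is the $n$-th harmonic number.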

arXiv:2103.15712 [pdf, ps, other]

doi 10.1090/mcom/3727

A Sharp Discrepancy Bound for Jittered Sampling

Authors: Benjamin Doerr

Abstract: For $m, d \in {\mathbb N}$, a jittered sampling point set $P$ having $N = m^d$ points in $[0,1)^d$ is constructed by partitioning the unit cube $[0,1)^d$ into $m^d$ axis-aligned cubes of equal size and then placing one point independently and uniformly at random in each cube. We show that there are constants $c \ge 0$ and $C$ such that for all $d$ and all $m \ge d$ the expected non-normalized star discrepancy of a jittered sampling point set satisfies \[c \,dm^{\frac{d-1}{2}} \sqrt{1 + \log(\tfrac md)} \le {\mathbb E} D^*(P) \le C\, dm^{\frac{d-1}{2}} \sqrt{1 + \log(\tfrac md)}.\] This discrepancy is thus smaller by a factor of $Θ\big(\sqrt{\frac{1+\log(m/d)}{m/d}}\,\big)$ than the one of a uniformly distributed random point set of $m^d$ points. This result improves both the upper and the lower bound for the discrepancy of jittered sampling given by Pausinger and Steinerberger (Journal of Complexity (2016)). It also removes the asymptotic requirement that $m$ is sufficiently large compared to $d$.

Submitted 28 December, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

Journal ref: Math. Comp. 91 (2022), 1871-1892
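
The construction of a jittered sampling point set given in the abstract above (partition $[0,1)^d$ into $m^d$ equal axis-aligned cubes, then place one uniform random point in each) can be sketched as follows. The function name `jittered_sample` and the optional `rng` argument are illustrative choices, not taken from the paper.

```python
import random
from itertools import product


def jittered_sample(m, d, rng=None):
    """Return m**d points in [0,1)^d, one uniform random point per grid cell."""
    rng = rng or random.Random()
    points = []
    # Each cell is identified by its integer index vector in {0, ..., m-1}^d.
    for cell in product(range(m), repeat=d):
        # Shift each coordinate by an independent uniform offset inside
        # the cell, which has side length 1/m.
        points.append(tuple((i + rng.random()) / m for i in cell))
    return points
```

For example, `jittered_sample(4, 2)` yields 16 points in the unit square, one in each cell of a 4-by-4 grid.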

Showing 1–50 of 142 results for author: Dorr, B