arXiv:2406.06629 [pdf, other]

A Survey of Meta-features Used for Automated Selection of Algorithms for Black-box Single-objective Continuous Optimization

Authors: Gjorgjina Cenikj, Ana Nikolikj, Gašper Petelin, Niki van Stein, Carola Doerr, Tome Eftimov

Abstract: The selection of the most appropriate algorithm to solve a given problem instance, known as algorithm selection, is driven by the potential to capitalize on the complementary performance of different algorithms across sets of problem instances. However, determining the optimal algorithm for an unseen problem instance has been shown to be a challenging task, which has garnered significant attention from researchers in recent years. In this survey, we conduct an overview of the key contributions to algorithm selection in the field of single-objective continuous black-box optimization. We present ongoing work in representation learning of meta-features for optimization problem instances, algorithm instances, and their interactions. We also study machine learning models for automated algorithm selection, configuration, and performance prediction. Through this analysis, we identify gaps in the state of the art, based on which we present ideas for further development of meta-feature representations.

Submitted 8 June, 2024; originally announced June 2024.

Comments: 14 pages, 2 figures

MSC Class: 68W50 (Primary) 68T30 (Secondary) ACM Class: F.2.1; I.2.4

arXiv:2405.12259 [pdf, other]

Generalization Ability of Feature-based Performance Prediction Models: A Statistical Analysis across Benchmarks

Authors: Ana Nikolikj, Ana Kostovska, Gjorgjina Cenikj, Carola Doerr, Tome Eftimov

Abstract: This study examines the generalization ability of algorithm performance prediction models across various benchmark suites. Comparing the statistical similarity between the problem collections with the accuracy of performance prediction models that are based on exploratory landscape analysis features, we observe that there is a positive correlation between these two measures. Specifically, when the high-dimensional feature value distributions between training and testing suites lack statistical significance, the model tends to generalize well, in the sense that the testing errors are in the same range as the training errors. Two experiments validate these findings: one involving the standard benchmark suites, the BBOB and CEC collections, and another using five collections of affine combinations of BBOB problem instances.

Submitted 20 May, 2024; originally announced May 2024.

Comments: To appear in the Proc. of the 2024 IEEE World Congress on Computational Intelligence - Congress on Evolutionary Computation

arXiv:2311.18035 [pdf, other]

TransOpt: Transformer-based Representation Learning for Optimization Problem Classification

Authors: Gjorgjina Cenikj, Gašper Petelin, Tome Eftimov

Abstract: We propose a representation of optimization problem instances using a transformer-based neural network architecture trained for the task of problem classification of the 24 problem classes from the Black-box Optimization Benchmarking (BBOB) benchmark. We show that transformer-based methods can be trained to recognize problem classes with accuracies in the range of 70%-80% for different problem dimensions, suggesting the possible application of transformer architectures in acquiring representations for black-box optimization problems.

Submitted 29 November, 2023; originally announced November 2023.

arXiv:2310.10685 [pdf, other]

PS-AAS: Portfolio Selection for Automated Algorithm Selection in Black-Box Optimization

Authors: Ana Kostovska, Gjorgjina Cenikj, Diederick Vermetten, Anja Jankovic, Ana Nikolikj, Urban Skvorc, Peter Korosec, Carola Doerr, Tome Eftimov

Abstract: The performance of automated algorithm selection (AAS) strongly depends on the portfolio of algorithms to choose from. Selecting the portfolio is a non-trivial task that requires balancing the trade-off between the higher flexibility of large portfolios with the increased complexity of the AAS task. In practice, probably the most common way to choose the algorithms for the portfolio is a greedy selection of the algorithms that perform well in some reference tasks of interest. We set out in this work to investigate alternative, data-driven portfolio selection techniques. Our proposed method creates algorithm behavior meta-representations, constructs a graph from a set of algorithms based on their meta-representation similarity, and applies a graph algorithm to select a final portfolio of diverse, representative, and non-redundant algorithms. We evaluate two distinct meta-representation techniques (SHAP and performance2vec) for selecting complementary portfolios from a total of 324 different variants of CMA-ES for the task of optimizing the BBOB single-objective problems in dimensionalities 5 and 30 with different cut-off budgets. We test two types of portfolios: one related to overall algorithm behavior and the 'personalized' one (related to algorithm behavior per each problem separately). We observe that the approach built on the performance2vec-based representations favors small portfolios with negligible error in the AAS task relative to the virtual best solver from the selected portfolio, whereas the portfolios built from the SHAP-based representations gain from higher flexibility at the cost of decreased performance of the AAS. Across most considered scenarios, personalized portfolios yield comparable or slightly better performance than the classical greedy approach. They outperform the full portfolio in all scenarios.

Submitted 14 October, 2023; originally announced October 2023.

Comments: Proc. of International Conference on Automated Machine Learning (AutoML 2023)

arXiv:2306.05438 [pdf, other]

doi 10.1145/3583131.3590401

DynamoRep: Trajectory-Based Population Dynamics for Classification of Black-box Optimization Problems

Authors: Gjorgjina Cenikj, Gašper Petelin, Carola Doerr, Peter Korošec, Tome Eftimov

Abstract: The application of machine learning (ML) models to the analysis of optimization algorithms requires the representation of optimization problems using numerical features. These features can be used as input for ML models that are trained to select or to configure a suitable algorithm for the problem at hand. Since in pure black-box optimization information about the problem instance can only be obtained through function evaluation, a common approach is to dedicate some function evaluations for feature extraction, e.g., using random sampling. This approach has two key downsides: (1) It reduces the budget left for the actual optimization phase, and (2) it neglects valuable information that could be obtained from a problem-solver interaction. In this paper, we propose a feature extraction method that describes the trajectories of optimization algorithms using simple descriptive statistics. We evaluate the generated features for the task of classifying problem classes from the Black Box Optimization Benchmarking (BBOB) suite. We demonstrate that the proposed DynamoRep features capture enough information to identify the problem class on which the optimization algorithm is running, achieving a mean classification accuracy of 95% across all experiments.

Submitted 8 June, 2023; originally announced June 2023.

Comments: 9 pages, 5 figures

arXiv:2306.00040 [pdf, other]

Assessing the Generalizability of a Performance Predictive Model

Authors: Ana Nikolikj, Gjorgjina Cenikj, Gordana Ispirova, Diederick Vermetten, Ryan Dieter Lang, Andries Petrus Engelbrecht, Carola Doerr, Peter Korošec, Tome Eftimov

Abstract: A key component of automated algorithm selection and configuration, which in most cases are performed using supervised machine learning (ML) methods, is a good-performing predictive model. The predictive model uses the feature representation of a set of problem instances as input data and predicts the algorithm performance achieved on them. Common machine learning models struggle to make predictions for instances with feature representations not covered by the training data, resulting in poor generalization to unseen problems. In this study, we propose a workflow to estimate the generalizability of a predictive model for algorithm performance, trained on one benchmark suite to another. The workflow has been tested by training predictive models across benchmark suites and the results show that generalizability patterns in the landscape feature space are reflected in the performance space.

Submitted 31 May, 2023; originally announced June 2023.

Comments: To appear at GECCO 2023

arXiv:2209.04412 [pdf, other]

doi 10.1007/978-3-031-14714-2_2

Improving Nevergrad's Algorithm Selection Wizard NGOpt through Automated Algorithm Configuration

Authors: Risto Trajanov, Ana Nikolikj, Gjorgjina Cenikj, Fabien Teytaud, Mathurin Videau, Olivier Teytaud, Tome Eftimov, Manuel López-Ibáñez, Carola Doerr

Abstract: Algorithm selection wizards are effective and versatile tools that automatically select an optimization algorithm given high-level information about the problem and available computational resources, such as number and type of decision variables, maximal number of evaluations, possibility to parallelize evaluations, etc. State-of-the-art algorithm selection wizards are complex and difficult to improve. We propose in this work the use of automated configuration methods for improving their performance by finding better configurations of the algorithms that compose them. In particular, we use elitist iterated racing (irace) to find CMA configurations for specific artificial benchmarks that replace the hand-crafted CMA configurations currently used in the NGOpt wizard provided by the Nevergrad platform. We discuss in detail the setup of irace for the purpose of generating configurations that work well over the diverse set of problem instances within each benchmark. Our approach improves the performance of the NGOpt wizard, even on benchmark suites that were not part of the tuning by irace.

Submitted 9 September, 2022; originally announced September 2022.

Comments: Proc. of PPSN 2022

arXiv:2204.11527 [pdf, other]

doi 10.1145/3512290.3528809

SELECTOR: Selecting a Representative Benchmark Suite for Reproducible Statistical Comparison

Authors: Gjorgjina Cenikj, Ryan Dieter Lang, Andries Petrus Engelbrecht, Carola Doerr, Peter Korošec, Tome Eftimov

Abstract: Fair algorithm evaluation is conditioned on the existence of high-quality benchmark datasets that are non-redundant and are representative of typical optimization scenarios. In this paper, we evaluate three heuristics for selecting diverse problem instances which should be involved in the comparison of optimization algorithms in order to ensure robust statistical algorithm performance analysis. The first approach employs clustering to identify similar groups of problem instances and subsequent sampling from each cluster to construct new benchmarks, while the other two approaches use graph algorithms for identifying dominating and maximal independent sets of nodes. We demonstrate the applicability of the proposed heuristics by performing a statistical performance analysis of five portfolios consisting of three optimization algorithms on five of the most commonly used optimization benchmarks. The results indicate that the statistical analyses of the algorithms' performance, conducted on each benchmark separately, produce conflicting outcomes, which can be used to give a false indication of the superiority of one algorithm over another. On the other hand, when the analysis is conducted on the problem instances selected with the proposed heuristics, which uniformly cover the problem landscape, the statistical outcomes are robust and consistent.

Submitted 25 April, 2022; originally announced April 2022.

Comments: 10 pages, 6 figures

arXiv:2110.02019 [pdf, other]

FoodChem: A food-chemical relation extraction model

Authors: Gjorgjina Cenikj, Barbara Koroušić Seljak, Tome Eftimov

Abstract: In this paper, we present FoodChem, a new Relation Extraction (RE) model for identifying chemicals present in the composition of food entities, based on textual information provided in biomedical peer-reviewed scientific literature. The RE task is treated as a binary classification problem, aimed at identifying whether the "contains" relation exists between a food-chemical entity pair. This is accomplished by fine-tuning BERT, BioBERT and RoBERTa transformer models. For evaluation purposes, a novel dataset with annotated "contains" relations in food-chemical entity pairs is generated, in a golden and silver version. The models are integrated into a voting scheme in order to produce the silver version of the dataset which we use for augmenting the individual models, while the manually annotated golden version is used for their evaluation. Out of the three evaluated models, the BioBERT model achieves the best results, with a macro averaged F1 score of 0.902 in the unbalanced augmentation setting.

Submitted 8 October, 2021; v1 submitted 5 October, 2021; originally announced October 2021.

Comments: 8 pages, 3 figures, 2 tables

Showing 1–9 of 9 results for author: Cenikj, G