Search | arXiv e-print repository

An adaptive approach to Bayesian Optimization with switching costs

Authors: Stefan Pricopie, Richard Allmendinger, Manuel Lopez-Ibanez, Clyde Fare, Matt Benatan, Joshua Knowles

Abstract: We investigate modifications to Bayesian Optimization for a resource-constrained setting of sequential experimental design where changes to certain design variables of the search space incur a switching cost. This models the scenario where there is a trade-off between evaluating more while maintaining the same setup, or switching and restricting the number of possible evaluations due to the incurred cost. We adapt two process-constrained batch algorithms to this sequential problem formulation, and propose two new methods: one cost-aware and one cost-ignorant. We validate and compare the algorithms using a set of 7 scalable test functions in different dimensionalities and switching-cost settings for 30 total configurations. Our proposed cost-aware hyperparameter-free algorithm yields comparable results to tuned process-constrained algorithms in all settings we considered, suggesting some degree of robustness to varying landscape features and cost trade-offs. This method starts to outperform the other algorithms with increasing switching-cost. Our work broadens out from other recent Bayesian Optimization studies in resource-constrained settings that consider a batch setting only. While the contributions of this work are relevant to the general class of resource-constrained problems, they are particularly relevant to problems where adaptability to varying resource availability is of high importance.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2404.10176 [pdf, other]

Multi-objective evolutionary GAN for tabular data synthesis

Authors: Nian Ran, Bahrul Ilmi Nasution, Claire Little, Richard Allmendinger, Mark Elliot

Abstract: Synthetic data has a key role to play in data sharing by statistical agencies and other generators of statistical data products. Generative Adversarial Networks (GANs), typically applied to image synthesis, are also a promising method for tabular data synthesis. However, there are unique challenges in tabular data compared to images, e.g., tabular data may contain both continuous and discrete variables and conditional sampling, and, critically, the data should possess high utility and low disclosure risk (the risk of re-identifying a population unit or learning something new about them), providing an opportunity for multi-objective (MO) optimization. Inspired by MO GANs for images, this paper proposes a smart MO evolutionary conditional tabular GAN (SMOE-CTGAN). This approach models conditional synthetic data by applying conditional vectors in training, and uses concepts from MO optimisation to balance disclosure risk against utility. Our results indicate that SMOE-CTGAN is able to discover synthetic datasets with different risk and utility levels for multiple national census datasets. We also find a sweet spot in the early stage of training where a competitive utility and extremely low risk are achieved, by using an Improvement Score. The full code can be downloaded from https://github.com/HuskyNian/SMO_EGAN_pytorch.

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.06031 [pdf, other]

FuSeBMC AI: Acceleration of Hybrid Approach through Machine Learning

Authors: Kaled M. Alshmrany, Mohannad Aldughaim, Chenfeng Wei, Tom Sweet, Richard Allmendinger, Lucas C. Cordeiro

Abstract: We present FuSeBMC-AI, a test generation tool grounded in machine learning techniques. FuSeBMC-AI extracts various features from the program and employs support vector machine and neural network models to predict a hybrid approach optimal configuration. FuSeBMC-AI utilizes Bounded Model Checking and Fuzzing as back-end verification engines. FuSeBMC-AI outperforms the default configuration of the underlying verification engine in certain cases while concurrently diminishing resource consumption.

Submitted 9 April, 2024; originally announced April 2024.

arXiv:2310.12842 [pdf, other]

Model-agnostic variable importance for predictive uncertainty: an entropy-based approach

Authors: Danny Wood, Theodore Papamarkou, Matt Benatan, Richard Allmendinger

Abstract: In order to trust the predictions of a machine learning algorithm, it is necessary to understand the factors that contribute to those predictions. In the case of probabilistic and uncertainty-aware models, it is necessary to understand not only the reasons for the predictions themselves, but also the reasons for the model's level of confidence in those predictions. In this paper, we show how existing methods in explainability can be extended to uncertainty-aware models and how such extensions can be used to understand the sources of uncertainty in a model's predictive distribution. In particular, by adapting permutation feature importance, partial dependence plots, and individual conditional expectation plots, we demonstrate that novel insights into model behaviour may be obtained and that these methods can be used to measure the impact of features on both the entropy of the predictive distribution and the log-likelihood of the ground truth labels under that distribution. With experiments using both synthetic and real-world data, we demonstrate the utility of these approaches to understand both the sources of uncertainty and their impact on model performance.

Submitted 28 May, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

arXiv:2305.15102 [pdf, other]

Analysis of modular CMA-ES on strict box-constrained problems in the SBOX-COST benchmarking suite

Authors: Diederick Vermetten, Manuel López-Ibáñez, Olaf Mersmann, Richard Allmendinger, Anna V. Kononova

Abstract: Box-constraints limit the domain of decision variables and are common in real-world optimization problems, for example, due to physical, natural or spatial limitations. Consequently, solutions violating a box-constraint may not be evaluable. This assumption is often ignored in the literature, e.g., existing benchmark suites, such as COCO/BBOB, allow the optimizer to evaluate infeasible solutions. This paper presents an initial study on the strict-box-constrained benchmarking suite (SBOX-COST), which is a variant of the well-known BBOB benchmark suite that enforces box-constraints by returning an invalid evaluation value for infeasible solutions. Specifically, we want to understand the performance difference between BBOB and SBOX-COST as a function of two initialization methods and six constraint-handling strategies all tested with modular CMA-ES. We find that, contrary to what may be expected, handling box-constraints by saturation is not always better than not handling them at all. However, across all BBOB functions, saturation is better than not handling, and the difference increases with the number of dimensions. Strictly enforcing box-constraints also has a clear negative effect on the performance of classical CMA-ES (with uniform random initialization and no constraint handling), especially as problem dimensionality increases.

Submitted 24 May, 2023; originally announced May 2023.

arXiv:2305.11648 [pdf, other]

doi 10.1145/3583133.3596312

Applying Ising Machines to Multi-objective QUBOs

Authors: Mayowa Ayodele, Richard Allmendinger, Manuel López-Ibáñez, Arnaud Liefooghe, Matthieu Parizy

Abstract: Multi-objective optimisation problems involve finding solutions with varying trade-offs between multiple and often conflicting objectives. Ising machines are physical devices that aim to find the absolute or approximate ground states of an Ising model. To apply Ising machines to multi-objective problems, a weighted sum objective function is used to convert multi-objective into single-objective problems. However, deriving scalarisation weights that achieve evenly distributed solutions across the Pareto front is not trivial. Previous work has shown that adaptive weights based on dichotomic search, and one based on averages of previously explored weights, can explore the Pareto front quicker than uniformly generated weights. However, these adaptive methods have only been applied to bi-objective problems in the past. In this work, we extend the adaptive method based on averages in two ways: (i) we extend the adaptive method of deriving scalarisation weights for problems with two or more objectives, and (ii) we use an alternative measure of distance to improve performance. We compare the proposed method with existing ones and show that it leads to the best performance on multi-objective Unconstrained Binary Quadratic Programming (mUBQP) instances with 3 and 4 objectives and that it is competitive with the best one for instances with 2 objectives.

Submitted 19 May, 2023; originally announced May 2023.

ACM Class: G.1.6

arXiv:2210.11321 [pdf, other]

doi 10.1007/978-3-031-24907-5_47

A Study of Scalarisation Techniques for Multi-Objective QUBO Solving

Authors: Mayowa Ayodele, Richard Allmendinger, Manuel López-Ibáñez, Matthieu Parizy

Abstract: In recent years, there has been significant research interest in solving Quadratic Unconstrained Binary Optimisation (QUBO) problems. Physics-inspired optimisation algorithms have been proposed for deriving optimal or sub-optimal solutions to QUBOs. These methods are particularly attractive within the context of using specialised hardware, such as quantum computers, application-specific CMOS and other high-performance computing resources for solving optimisation problems. These solvers are then applied to QUBO formulations of combinatorial optimisation problems. Quantum and quantum-inspired optimisation algorithms have shown promising performance when applied to academic benchmarks as well as real-world problems. However, QUBO solvers are single-objective solvers. To make them more efficient at solving problems with multiple objectives, a decision on how to convert such multi-objective problems to single-objective problems needs to be made. In this study, we compare methods of deriving scalarisation weights when combining two objectives of the cardinality constrained mean-variance portfolio optimisation problem into one. We show significant performance improvement (measured in terms of hypervolume) when using a method that iteratively fills the largest space in the Pareto front compared to a naïve approach using uniformly generated weights.

Submitted 20 October, 2022; originally announced October 2022.

MSC Class: 90C29; 90C20; 90C27 ACM Class: G.2.1; I.2.8

arXiv:2207.03339 [pdf, ps, other]

Comparing the Utility and Disclosure Risk of Synthetic Data with Samples of Microdata

Authors: Claire Little, Mark Elliot, Richard Allmendinger

Abstract: Most statistical agencies release randomly selected samples of Census microdata, usually with sample fractions under 10% and with other forms of statistical disclosure control (SDC) applied. An alternative to SDC is data synthesis, which has been attracting growing interest, yet there is no clear consensus on how to measure the associated utility and disclosure risk of the data. The ability to produce synthetic Census microdata, where the utility and associated risks are clearly understood, could mean that more timely and wider-ranging access to microdata would be possible. This paper follows on from previous work by the authors which mapped synthetic Census data on a risk-utility (R-U) map. The paper presents a framework to measure the utility and disclosure risk of synthetic data by comparing it to samples of the original data of varying sample fractions, thereby identifying the sample fraction which has equivalent utility and risk to the synthetic data. Three commonly used data synthesis packages are compared with some interesting results. Further work is needed in several directions but the methodology looks very promising.

Submitted 2 July, 2022; originally announced July 2022.

arXiv:2207.02212 [pdf]

Combining Topic Modeling with Grounded Theory: Case Studies of Project Collaboration

Authors: Eyyub Can Odacioglu, Lihong Zhang, Richard Allmendinger

Abstract: This paper proposes an Artificial Intelligence (AI) Grounded Theory for management studies. We argue that this novel and rigorous approach, which embeds topic modelling, will enable latent knowledge to be found. We illustrate this abductive method using 51 case studies of collaborative innovation published by the Project Management Institute (PMI). Initial results are presented and discussed that include 40 topics, 6 categories, 4 of which are core categories, and two new theories of project collaboration.

Submitted 28 June, 2022; originally announced July 2022.

arXiv:2206.13844 [pdf, other]

Cooperative Multi-Agent Search on Endogenously-Changing Fitness Landscapes

Authors: Chin Woei Lim, Richard Allmendinger, Joshua Knowles, Ayesha Alhosani, Mercedes Bleda

Abstract: We use a multi-agent system to model how agents (representing firms) may collaborate and adapt in a business 'landscape' where some, more influential, firms are given the power to shape the landscape of other firms. The landscapes we study are based on the well-known NK model of Kauffman, with the addition of 'shapers', firms that can change the landscape's features for themselves and all other players. Our work investigates how firms that are additionally endowed with cognitive and experiential search, and the ability to form collaborations with other firms, can use these capabilities to adapt more quickly and adeptly. We find that, in a collaborative group, firms must still have a mind of their own and resist direct mimicry of stronger partners to attain better heights collectively. Larger groups and groups with more influential members generally do better, so targeted intelligent cooperation is beneficial. These conclusions are tentative, and our results show a sensitivity to landscape ruggedness and "malleability" (i.e. the capacity of the landscape to be changed by the shaper firms). Overall, our work demonstrates the potential of computer science, evolution, and machine learning to contribute to business strategy in these complex environments.

Submitted 28 June, 2022; originally announced June 2022.

arXiv:2206.07834 [pdf, other]

Efficient Approximation of Expected Hypervolume Improvement using Gauss-Hermite Quadrature

Authors: Alma Rahat, Tinkle Chugh, Jonathan Fieldsend, Richard Allmendinger, Kaisa Miettinen

Abstract: Many methods for performing multi-objective optimisation of computationally expensive problems have been proposed recently. Typically, a probabilistic surrogate for each objective is constructed from an initial dataset. The surrogates can then be used to produce predictive densities in the objective space for any solution. Using the predictive densities, we can compute the expected hypervolume improvement (EHVI) due to a solution. Maximising the EHVI, we can locate the most promising solution that may be expensively evaluated next. There are closed-form expressions for computing the EHVI, integrating over the multivariate predictive densities. However, they require partitioning the objective space, which can be prohibitively expensive for more than three objectives. Furthermore, there are no closed-form expressions for a problem where the predictive densities are dependent, capturing the correlations between objectives. Monte Carlo approximation is used instead in such cases, which is not cheap. Hence, the need to develop new accurate but cheaper approximation methods remains. Here we investigate an alternative approach toward approximating the EHVI using Gauss-Hermite quadrature. We show that it can be an accurate alternative to Monte Carlo for both independent and correlated predictive densities with statistically significant rank correlations for a range of popular test problems.

Submitted 15 June, 2022; originally announced June 2022.

arXiv:2205.13399 [pdf, other]

doi 10.1145/3512290.3528698

Multi-objective QUBO Solver: Bi-objective Quadratic Assignment

Authors: Mayowa Ayodele, Richard Allmendinger, Manuel López-Ibáñez, Matthieu Parizy

Abstract: Quantum and quantum-inspired optimisation algorithms are designed to solve problems represented in binary, quadratic and unconstrained form. Combinatorial optimisation problems are therefore often formulated as Quadratic Unconstrained Binary Optimisation Problems (QUBO) to solve them with these algorithms. Moreover, these QUBO solvers are often implemented using specialised hardware to achieve enormous speedups, e.g. Fujitsu's Digital Annealer (DA) and D-Wave's Quantum Annealer. However, these are single-objective solvers, while many real-world problems feature multiple conflicting objectives. Thus, a common practice when using these QUBO solvers is to scalarise such multi-objective problems into a sequence of single-objective problems. Due to design trade-offs of these solvers, formulating each scalarisation may require more time than finding a local optimum. We present the first attempt to extend the algorithm supporting a commercial QUBO solver as a multi-objective solver that is not based on scalarisation. The proposed multi-objective DA algorithm is validated on the bi-objective Quadratic Assignment Problem. We observe that algorithm performance significantly depends on the archiving strategy adopted, and that combining DA with non-scalarisation methods to optimise multiple objectives outperforms the current scalarised version of the DA in terms of final solution quality.

Submitted 26 May, 2022; originally announced May 2022.

Comments: The Genetic and Evolutionary Computation Conference 2022 (GECCO22)

ACM Class: C.1.4; G.2.1

arXiv:2204.01852 [pdf, other]

A Data-Driven Framework for Identifying Investment Opportunities in Private Equity

Authors: Samantha Petersone, Alwin Tan, Richard Allmendinger, Sujit Roy, James Hales

Abstract: The core activity of a Private Equity (PE) firm is to invest in companies in order to provide the investors with profit, usually within 4-7 years. The decision whether to invest in a company is typically made manually by looking at various performance indicators of the company and then making a decision often based on instinct. This process is rather unmanageable given the large number of companies to potentially invest in. Moreover, as more data about company performance indicators becomes available and the number of different indicators one may want to consider increases, manual crawling and assessment of investment opportunities becomes inefficient and ultimately impossible. To address these issues, this paper proposes a framework for automated data-driven screening of investment opportunities and thus the recommendation of businesses to invest in. The framework draws on data from several sources to assess the financial and managerial position of a company, and then uses an explainable artificial intelligence (XAI) engine to suggest investment recommendations. The robustness of the model is validated using different AI algorithms, class imbalance-handling methods, and features extracted from the available data sources.

Submitted 4 April, 2022; originally announced April 2022.

arXiv:2203.12622 [pdf, other]

doi 10.1145/3512290.3528818

Are Evolutionary Algorithms Safe Optimizers?

Authors: Youngmin Kim, Richard Allmendinger, Manuel López-Ibáñez

Abstract: We consider a type of constrained optimization problem, where the violation of a constraint leads to an irrevocable loss, such as breakage of a valuable experimental resource/platform or loss of human life. Such problems are referred to as safe optimization problems (SafeOPs). While SafeOPs have received attention in the machine learning community in recent years, there was little interest in the evolutionary computation (EC) community despite some early attempts between 2009 and 2011. Moreover, there is a lack of acceptable guidelines on how to benchmark different algorithms for SafeOPs, an area where the EC community has significant experience. Driven by the need for more efficient algorithms and benchmark guidelines for SafeOPs, the objective of this paper is to reignite interest in this problem class in the EC community. To achieve this we (i) provide a formal definition of SafeOPs and contrast it to other types of optimization problems that the EC community is familiar with, (ii) investigate the impact of key SafeOP parameters on the performance of selected safe optimization algorithms, (iii) benchmark EC against state-of-the-art safe optimization algorithms from the machine learning community, and (iv) provide an open-source Python framework to replicate and extend our work.

Submitted 6 May, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

Comments: To be published in proceedings of Genetic and Evolutionary Computation Conference (GECCO 22), July 9-13, 2022, Boston, MA, USA. ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3512290.3528818

ACM Class: I.2.8

arXiv:2202.12187 [pdf, ps, other]

SonOpt: Sonifying Bi-objective Population-Based Optimization Algorithms

Authors: Tasos Asonitis, Richard Allmendinger, Matt Benatan, Ricardo Climent

Abstract: We propose SonOpt, the first (open source) data sonification application for monitoring the progress of bi-objective population-based optimization algorithms during search, to facilitate algorithm understanding. SonOpt provides insights into convergence/stagnation of search, the evolution of the approximation set shape, location of recurring points in the approximation set, and population diversity. The benefits of data sonification have been shown for various non-optimization related monitoring tasks. However, very few attempts have been made in the context of optimization and their focus has been exclusively on single-objective problems. In comparison, SonOpt is designed for bi-objective optimization problems, relies on objective function values of non-dominated solutions only, and is designed with the user (listener) in mind; avoiding convolution of multiple sounds and prioritising ease of familiarizing with the system. This is achieved using two sonification paths relying on the concepts of wavetable and additive synthesis. This paper motivates and describes the architecture of SonOpt, and then validates SonOpt for two popular multi-objective optimization algorithms (NSGA-II and MOEA/D). Experience SonOpt yourself via https://github.com/tasos-a/SonOpt-1.0 .

Submitted 24 February, 2022; originally announced February 2022.

arXiv:2112.01925 [pdf, other]

Generative Adversarial Networks for Synthetic Data Generation: A Comparative Study

Authors: Claire Little, Mark Elliot, Richard Allmendinger, Sahel Shariati Samani

Abstract: Generative Adversarial Networks (GANs) are gaining increasing attention as a means for synthesising data. So far much of this work has been applied to use cases outside of the data confidentiality domain with a common application being the production of artificial images. Here we consider the potential application of GANs for the purpose of generating synthetic census microdata. We employ a battery of utility metrics and a disclosure risk metric (the Targeted Correct Attribution Probability) to compare the data produced by tabular GANs with those produced using orthodox data synthesis methods.

Submitted 3 December, 2021; originally announced December 2021.

arXiv:2107.00531 [pdf, ps, other]

Towards a fairer reimbursement system for burn patients using cost-sensitive classification

Authors: Chimdimma Noelyn Onah, Richard Allmendinger, Julia Handl, Ken W. Dunn

Abstract: The adoption of the Prospective Payment System (PPS) in the UK National Health Service (NHS) has led to the creation of patient groups called Health Resource Groups (HRG). HRGs aim to identify groups of clinically similar patients that share similar resource usage for reimbursement purposes. These groups are predominantly identified based on expert advice, with homogeneity checked using the length of stay (LOS). However, for complex patients such as those encountered in burn care, LOS is not a perfect proxy of resource usage, leading to incomplete homogeneity checks. To improve homogeneity in resource usage and severity, we propose a data-driven model and the inclusion of patient-level costing. We investigate whether a data-driven approach that considers additional measures of resource usage can lead to a more comprehensive model. In particular, a cost-sensitive decision tree model is adopted to identify features of importance and rules that allow for a focused segmentation on resource usage (LOS and patient-level cost) and clinical similarity (severity of burn). The proposed approach identified groups with increased homogeneity compared to the current HRG groups, allowing for a more equitable reimbursement of hospital care costs if adopted.

Submitted 1 July, 2021; originally announced July 2021.

Comments: Joint KDD 2021 Health Day and 2021 KDD Workshop on Applied Data Science for Healthcare: State of XAI and trustworthiness in Health

arXiv:2106.03275 [pdf, other]

What if we Increase the Number of Objectives? Theoretical and Empirical Implications for Many-objective Optimization

Authors: Richard Allmendinger, Andrzej Jaszkiewicz, Arnaud Liefooghe, Christiane Tammer

Abstract: The difficulty of solving a multi-objective optimization problem is impacted by the number of objectives to be optimized. The presence of many objectives typically introduces a number of challenges that affect the choice/design of optimization algorithms. This paper investigates the drivers of these challenges from two angles: (i) the influence of the number of objectives on problem characteristics and (ii) the practical behavior of commonly used procedures and algorithms for coping with many objectives. In addition to reviewing various drivers, the paper makes theoretical contributions by quantifying some drivers and/or verifying these drivers empirically by carrying out experiments on multi-objective NK landscapes and other typical benchmarks. We then make use of our theoretical and empirical findings to derive practical recommendations to support algorithm design. Finally, we discuss remaining theoretical gaps and opportunities for future research in the area of multi- and many-objective optimization.

Submitted 6 June, 2021; originally announced June 2021.

arXiv:2103.15546 [pdf, other]

Heterogeneous Objectives: State-of-the-Art and Future Research

Authors: Richard Allmendinger, Joshua Knowles

Abstract: Multiobjective optimization problems with heterogeneous objectives are defined as those that possess significantly different types of objective function components (not just incommensurable in units or scale). For example, in a heterogeneous problem the objective function components may differ in formal computational complexity, practical evaluation effort (time, costs, or resources), determinism (stochastic vs deterministic), or some combination of all three. A particularly challenging variety of heterogeneity may occur by the combination of a time-consuming laboratory-based objective with other objectives that are evaluated using faster computer-based calculations. Perhaps more commonly, all objectives may be evaluated computationally, but some may require a lengthy simulation process while others are computed from a relatively simple closed-form calculation. In this chapter, we motivate the need for more work on the topic of heterogeneous objectives (with reference to real-world examples), expand on a basic taxonomy of heterogeneity types, and review the state of the art in tackling these problems. We give special attention to heterogeneity in evaluation time (latency) as this requires sophisticated approaches. We also present original experimental work on estimating the amount of heterogeneity in evaluation time expected in many-objective problems, given reasonable assumptions, and survey related research threads that could contribute to this area in future.

Submitted 26 February, 2021; originally announced March 2021.

Comments: 20 pages, submitted for consideration to the MACODA book project

arXiv:2102.06940 [pdf, other]

doi 10.1109/TEVC.2021.3137369

HAWKS: Evolving Challenging Benchmark Sets for Cluster Analysis

Authors: Cameron Shand, Richard Allmendinger, Julia Handl, Andrew Webb, John Keane

Abstract: Comprehensive benchmarking of clustering algorithms is rendered difficult by two key factors: (i) the elusiveness of a unique mathematical definition of this unsupervised learning approach and (ii) dependencies between the generating models or clustering criteria adopted by some clustering algorithms and indices for internal cluster validation. Consequently, there is no consensus regarding the best practice for rigorous benchmarking, and whether this is possible at all outside the context of a given application. Here, we argue that synthetic datasets must continue to play an important role in the evaluation of clustering algorithms, but that this necessitates constructing benchmarks that appropriately cover the diverse set of properties that impact clustering algorithm performance. Through our framework, HAWKS, we demonstrate the important role evolutionary algorithms play to support flexible generation of such benchmarks, allowing simple modification and extension. We illustrate two possible uses of our framework: (i) the evolution of benchmark data consistent with a set of hand-derived properties and (ii) the generation of datasets that tease out performance differences between a given pair of algorithms. Our work has implications for the design of clustering benchmarks that sufficiently challenge a broad range of algorithms, and for furthering insight into the strengths and weaknesses of specific approaches.

Submitted 10 January, 2022; v1 submitted 13 February, 2021; originally announced February 2021.

Comments: Accepted version of the paper accepted to IEEE Transactions on Evolutionary Computation. 15 pages + 11 pages supplementary material

arXiv:2101.09505 [pdf, ps, other]

doi 10.1007/978-3-030-73959-1_12

Safe Learning and Optimization Techniques: Towards a Survey of the State of the Art

Authors: Youngmin Kim, Richard Allmendinger, Manuel López-Ibáñez

Abstract: Safe learning and optimization deals with learning and optimization problems that avoid, as much as possible, the evaluation of non-safe input points, which are solutions, policies, or strategies that cause an irrecoverable loss (e.g., breakage of a machine or equipment, or life threat). Although a comprehensive survey of safe reinforcement learning algorithms was published in 2015, a number of new algorithms have been proposed thereafter, and related works in active learning and in optimization were not considered. This paper reviews those algorithms from a number of domains including reinforcement learning, Gaussian process regression and classification, evolutionary algorithms, and active learning. We provide the fundamental concepts on which the reviewed algorithms are based and a characterization of the individual algorithms. We conclude by explaining how the algorithms are connected and suggestions for future research.

Submitted 23 June, 2021; v1 submitted 23 January, 2021; originally announced January 2021.

Comments: The final authenticated publication was made in: Heintz F., Milano M., O'Sullivan B. (eds) Trustworthy AI - Integrating Learning, Optimization and Reasoning. TAILOR 2020. Lecture Notes in Computer Science, vol 12641. Springer, Cham. The final authenticated publication is available online at https://doi.org/10.1007/978-3-030-73959-1_12

ACM Class: I.2.8

Showing 1–21 of 21 results for author: Allmendinger, R