-
Scalarisation-based risk concepts for robust multi-objective optimisation
Authors:
Ben Tu,
Nikolas Kantas,
Robert M. Lee,
Behrang Shafei
Abstract:
Robust optimisation is a well-established framework for optimising functions in the presence of uncertainty. The inherent goal of this problem is to identify a collection of inputs whose outputs are both desirable for the decision maker, whilst also being robust to the underlying uncertainties in the problem. In this work, we study the multi-objective extension of this problem from a computational…
▽ More
Robust optimisation is a well-established framework for optimising functions in the presence of uncertainty. The inherent goal of this problem is to identify a collection of inputs whose outputs are both desirable for the decision maker, whilst also being robust to the underlying uncertainties in the problem. In this work, we study the multi-objective extension of this problem from a computational standpoint. We identify that the majority of all robust multi-objective algorithms rely on two key operations: robustification and scalarisation. Robustification refers to the strategy that is used to marginalise over the uncertainty in the problem. Whilst scalarisation refers to the procedure that is used to encode the relative importance of each objective. As these operations are not necessarily commutative, the order that they are performed in has an impact on the resulting solutions that are identified and the final decisions that are made. This work aims to give an exposition on the philosophical differences between these two operations and highlight when one should opt for one ordering over the other. As part of our analysis, we showcase how many existing risk concepts can be easily integrated into the specification and solution of a robust multi-objective optimisation problem. Besides this, we also demonstrate how one can principally define the notion of a robust Pareto front and a robust performance metric based on our robustify and scalarise methodology. To illustrate the efficacy of these new ideas, we present two insightful numerical case studies which are based on real-world data sets.
△ Less
Submitted 16 May, 2024;
originally announced May 2024.
-
Random Pareto front surfaces
Authors:
Ben Tu,
Nikolas Kantas,
Robert M. Lee,
Behrang Shafei
Abstract:
The goal of multi-objective optimisation is to identify the Pareto front surface which is the set obtained by connecting the best trade-off points. Typically this surface is computed by evaluating the objectives at different points and then interpolating between the subset of the best evaluated trade-off points. In this work, we propose to parameterise the Pareto front surface using polar coordina…
▽ More
The goal of multi-objective optimisation is to identify the Pareto front surface which is the set obtained by connecting the best trade-off points. Typically this surface is computed by evaluating the objectives at different points and then interpolating between the subset of the best evaluated trade-off points. In this work, we propose to parameterise the Pareto front surface using polar coordinates. More precisely, we show that any Pareto front surface can be equivalently represented using a scalar-valued length function which returns the projected length along any positive radial direction. We then use this representation in order to rigorously develop the theory and applications of stochastic Pareto front surfaces. In particular, we derive many Pareto front surface statistics of interest such as the expectation, covariance and quantiles. We then discuss how these can be used in practice within a design of experiments setting, where the goal is to both infer and use the Pareto front surface distribution in order to make effective decisions. Our framework allows for clear uncertainty quantification and we also develop advanced visualisation techniques for this purpose. Finally we discuss the applicability of our ideas within multivariate extreme value theory and illustrate our methodology in a variety of numerical examples, including a case study with a real-world air pollution data set.
△ Less
Submitted 21 June, 2024; v1 submitted 2 May, 2024;
originally announced May 2024.
-
Multi-objective optimisation via the R2 utilities
Authors:
Ben Tu,
Nikolas Kantas,
Robert M. Lee,
Behrang Shafei
Abstract:
The goal of multi-objective optimisation is to identify a collection of points which describe the best possible trade-offs between the multiple objectives. In order to solve this vector-valued optimisation problem, practitioners often appeal to the use of scalarisation functions in order to transform the multi-objective problem into a collection of single-objective problems. This set of scalarised…
▽ More
The goal of multi-objective optimisation is to identify a collection of points which describe the best possible trade-offs between the multiple objectives. In order to solve this vector-valued optimisation problem, practitioners often appeal to the use of scalarisation functions in order to transform the multi-objective problem into a collection of single-objective problems. This set of scalarised problems can then be solved using traditional single-objective optimisation techniques. In this work, we formalise this convention into a general mathematical framework. We show how this strategy effectively recasts the original multi-objective optimisation problem into a single-objective optimisation problem defined over sets. An appropriate class of objective functions for this new problem are the R2 utilities, which are utility functions that are defined as a weighted integral over the scalarised optimisation problems. As part of our work, we show that these utilities are monotone and submodular set functions which can be optimised effectively using greedy optimisation algorithms. We then analyse the performance of these greedy algorithms both theoretically and empirically. Our analysis largely focusses on Bayesian optimisation, which is a popular probabilistic framework for black-box optimisation.
△ Less
Submitted 1 May, 2024; v1 submitted 19 May, 2023;
originally announced May 2023.
-
Privacy Risk for anisotropic Langevin dynamics using relative entropy bounds
Authors:
Anastasia Borovykh,
Nikolas Kantas,
Panos Parpas,
Greg Pavliotis
Abstract:
The privacy preserving properties of Langevin dynamics with additive isotropic noise have been extensively studied. However, the isotropic noise assumption is very restrictive: (a) when adding noise to existing learning algorithms to preserve privacy and maintain the best possible accuracy one should take into account the relative magnitude of the outputs and their correlations; (b) popular algori…
▽ More
The privacy preserving properties of Langevin dynamics with additive isotropic noise have been extensively studied. However, the isotropic noise assumption is very restrictive: (a) when adding noise to existing learning algorithms to preserve privacy and maintain the best possible accuracy one should take into account the relative magnitude of the outputs and their correlations; (b) popular algorithms such as stochastic gradient descent (and their continuous time limits) appear to possess anisotropic covariance properties. To study the privacy risks for the anisotropic noise case, one requires general results on the relative entropy between the laws of two Stochastic Differential Equations with different drifts and diffusion coefficients. Our main contribution is to establish such a bound using stability estimates for solutions to the Fokker-Planck equations via functional inequalities. With additional assumptions, the relative entropy bound implies an $(ε,δ)$-differential privacy bound or translates to bounds on the membership inference attack success and we show how anisotropic noise can lead to better privacy-accuracy trade-offs. Finally, the benefits of anisotropic noise are illustrated using numerical results in quadratic loss and neural network setups.
△ Less
Submitted 11 July, 2023; v1 submitted 1 February, 2023;
originally announced February 2023.
-
Joint Entropy Search for Multi-objective Bayesian Optimization
Authors:
Ben Tu,
Axel Gandy,
Nikolas Kantas,
Behrang Shafei
Abstract:
Many real-world problems can be phrased as a multi-objective optimization problem, where the goal is to identify the best set of compromises between the competing objectives. Multi-objective Bayesian optimization (BO) is a sample efficient strategy that can be deployed to solve these vector-valued optimization problems where access is limited to a number of noisy objective function evaluations. In…
▽ More
Many real-world problems can be phrased as a multi-objective optimization problem, where the goal is to identify the best set of compromises between the competing objectives. Multi-objective Bayesian optimization (BO) is a sample efficient strategy that can be deployed to solve these vector-valued optimization problems where access is limited to a number of noisy objective function evaluations. In this paper, we propose a novel information-theoretic acquisition function for BO called Joint Entropy Search (JES), which considers the joint information gain for the optimal set of inputs and outputs. We present several analytical approximations to the JES acquisition function and also introduce an extension to the batch setting. We showcase the effectiveness of this new approach on a range of synthetic and real-world problems in terms of the hypervolume and its weighted variants.
△ Less
Submitted 6 October, 2022;
originally announced October 2022.
-
The sharp, the flat and the shallow: Can weakly interacting agents learn to escape bad minima?
Authors:
Nikolas Kantas,
Panos Parpas,
Grigorios A. Pavliotis
Abstract:
An open problem in machine learning is whether flat minima generalize better and how to compute such minima efficiently. This is a very challenging problem. As a first step towards understanding this question we formalize it as an optimization problem with weakly interacting agents. We review appropriate background material from the theory of stochastic processes and provide insights that are rele…
▽ More
An open problem in machine learning is whether flat minima generalize better and how to compute such minima efficiently. This is a very challenging problem. As a first step towards understanding this question we formalize it as an optimization problem with weakly interacting agents. We review appropriate background material from the theory of stochastic processes and provide insights that are relevant to practitioners. We propose an algorithmic framework for an extended stochastic gradient Langevin dynamics and illustrate its potential. The paper is written as a tutorial, and presents an alternative use of multi-agent learning. Our primary focus is on the design of algorithms for machine learning applications; however the underlying mathematical framework is suitable for the understanding of large scale systems of agent based models that are popular in the social sciences, economics and finance.
△ Less
Submitted 10 May, 2019;
originally announced May 2019.
-
On Adaptive Estimation for Dynamic Bernoulli Bandits
Authors:
Xue Lu,
Niall Adams,
Nikolas Kantas
Abstract:
The multi-armed bandit (MAB) problem is a classic example of the exploration-exploitation dilemma. It is concerned with maximising the total rewards for a gambler by sequentially pulling an arm from a multi-armed slot machine where each arm is associated with a reward distribution. In static MABs, the reward distributions do not change over time, while in dynamic MABs, each arm's reward distributi…
▽ More
The multi-armed bandit (MAB) problem is a classic example of the exploration-exploitation dilemma. It is concerned with maximising the total rewards for a gambler by sequentially pulling an arm from a multi-armed slot machine where each arm is associated with a reward distribution. In static MABs, the reward distributions do not change over time, while in dynamic MABs, each arm's reward distribution can change, and the optimal arm can switch over time. Motivated by many real applications where rewards are binary, we focus on dynamic Bernoulli bandits. Standard methods like $ε$-Greedy and Upper Confidence Bound (UCB), which rely on the sample mean estimator, often fail to track changes in the underlying reward for dynamic problems. In this paper, we overcome the shortcoming of slow response to change by deploying adaptive estimation in the standard methods and propose a new family of algorithms, which are adaptive versions of $ε$-Greedy, UCB, and Thompson sampling. These new methods are simple and easy to implement. Moreover, they do not require any prior knowledge about the dynamic reward process, which is important for real applications. We examine the new algorithms numerically in different scenarios and the results show solid improvements of our algorithms in dynamic environments.
△ Less
Submitted 14 May, 2018; v1 submitted 8 December, 2017;
originally announced December 2017.
-
Distributed Maximum Likelihood for Simultaneous Self-localization and Tracking in Sensor Networks
Authors:
Nikolas Kantas,
Sumeetpal S. Singh,
Arnaud Doucet
Abstract:
We show that the sensor self-localization problem can be cast as a static parameter estimation problem for Hidden Markov Models and we implement fully decentralized versions of the Recursive Maximum Likelihood and on-line Expectation-Maximization algorithms to localize the sensor network simultaneously with target tracking. For linear Gaussian models, our algorithms can be implemented exactly usin…
▽ More
We show that the sensor self-localization problem can be cast as a static parameter estimation problem for Hidden Markov Models and we implement fully decentralized versions of the Recursive Maximum Likelihood and on-line Expectation-Maximization algorithms to localize the sensor network simultaneously with target tracking. For linear Gaussian models, our algorithms can be implemented exactly using a distributed version of the Kalman filter and a novel message passing algorithm. The latter allows each node to compute the local derivatives of the likelihood or the sufficient statistics needed for Expectation-Maximization. In the non-linear case, a solution based on local linearization in the spirit of the Extended Kalman Filter is proposed. In numerical examples we demonstrate that the developed algorithms are able to learn the localization parameters.
△ Less
Submitted 19 June, 2012;
originally announced June 2012.