-
Good practices for Bayesian Optimization of high dimensional structured spaces
Authors:
Eero Siivola,
Javier Gonzalez,
Andrei Paleyes,
Aki Vehtari
Abstract:
The increasing availability of structured but high-dimensional data has opened new opportunities for optimization. One emerging and promising avenue is the exploration of unsupervised methods for projecting structured high-dimensional data into low-dimensional continuous representations, simplifying the optimization problem and enabling the application of traditional optimization methods. However, this line of research has so far been purely methodological, with little connection to the needs of practitioners. In this paper, we study the effect of different search space design choices for performing Bayesian Optimization in high-dimensional structured datasets. In particular, we analyse the influence of the dimensionality of the latent space and the role of the acquisition function, and we evaluate new methods to automatically define the optimization bounds in the latent space. Finally, based on experimental results using synthetic and real datasets, we provide recommendations for practitioners.
Submitted 6 January, 2021; v1 submitted 31 December, 2020;
originally announced December 2020.
-
Preferential Batch Bayesian Optimization
Authors:
Eero Siivola,
Akash Kumar Dhaka,
Michael Riis Andersen,
Javier Gonzalez,
Pablo Garcia Moreno,
Aki Vehtari
Abstract:
Most research in Bayesian optimization (BO) has focused on direct feedback scenarios, where one has access to exact values of some expensive-to-evaluate objective. This direction has been mainly driven by the use of BO in machine learning hyper-parameter configuration problems. However, in domains such as modelling human preferences, A/B tests, or recommender systems, there is a need for methods that can replace direct feedback with preferential feedback, obtained via rankings or pairwise comparisons. In this work, we present preferential batch Bayesian optimization (PBBO), a new framework that allows finding the optimum of a latent function of interest, given any type of parallel preferential feedback for a group of two or more points. We do so by using a Gaussian process model with a likelihood specially designed to enable parallel and efficient data collection mechanisms, which are key in modern machine learning. We show how the acquisitions developed under this framework generalize and augment previous approaches in Bayesian optimization, expanding the use of these techniques to a wider range of domains. An extensive simulation study shows the benefits of this approach, both with simulated functions and four real data sets.
Submitted 31 August, 2021; v1 submitted 25 March, 2020;
originally announced March 2020.
-
Active Learning for Decision-Making from Imbalanced Observational Data
Authors:
Iiris Sundin,
Peter Schulam,
Eero Siivola,
Aki Vehtari,
Suchi Saria,
Samuel Kaski
Abstract:
Machine learning can help personalized decision support by learning models to predict individual treatment effects (ITE). This work studies the reliability of prediction-based decision-making in a task of deciding which action $a$ to take for a target unit after observing its covariates $\tilde{x}$ and predicted outcomes $\hat{p}(\tilde{y} \mid \tilde{x}, a)$. An example case is personalized medicine and the decision of which treatment to give to a patient. A common problem when learning these models from observational data is imbalance, that is, a difference between the treated and control covariate distributions, which is known to increase the upper bound of the expected ITE estimation error. We propose to assess the decision-making reliability by estimating the ITE model's Type S error rate, which is the probability of the model inferring the wrong sign of the treatment effect. Furthermore, we use the estimated reliability as a criterion for active learning, in order to collect new (possibly expensive) observations, instead of making a forced choice based on unreliable predictions. We demonstrate the effectiveness of this decision-making-aware active learning in two decision-making tasks: in simulated data with binary outcomes and in a medical dataset with synthetic and continuous treatment outcomes.
Submitted 6 June, 2019; v1 submitted 10 April, 2019;
originally announced April 2019.
-
Requirement verification in simulation-based automation testing
Authors:
Eero Siivola,
Seppo Sierla,
Hannu Niemistö,
Tommi Karhela,
Valeriy Vyatkin
Abstract:
The emergence of the Industrial Internet results in an increasing number of complicated temporal interdependencies between automation systems and the processes to be controlled. There is a need for verification methods that scale better than formal verification and are more exact than testing. Simulation-based runtime verification is proposed as such a method, and an application of Metric temporal logic is presented as a contribution. The practical scalability of the proposed approach is validated against a production process designed by an industrial partner, resulting in the discovery of requirement violations.
Submitted 25 April, 2017; v1 submitted 8 February, 2016;
originally announced February 2016.