-
The Algorithm Configuration Problem
Authors:
Gabriele Iommazzo,
Claudia D'Ambrosio,
Antonio Frangioni,
Leo Liberti
Abstract:
The field of algorithmic optimization has significantly advanced with the development of methods for the automatic configuration of algorithmic parameters. This article delves into the Algorithm Configuration Problem, focused on optimizing parametrized algorithms for solving specific instances of decision/optimization problems. We present a comprehensive framework that not only formalizes the Algorithm Configuration Problem, but also outlines different approaches for its resolution, leveraging machine learning models and heuristic strategies. The article categorizes existing methodologies into per-instance and per-problem approaches, distinguishing between offline and online strategies for model construction and deployment. By synthesizing these approaches, we aim to provide a clear pathway for both understanding and addressing the complexities inherent in algorithm configuration.
Submitted 1 March, 2024;
originally announced March 2024.
-
Learning to Configure Mathematical Programming Solvers by Mathematical Programming
Authors:
Gabriele Iommazzo,
Claudia D'Ambrosio,
Antonio Frangioni,
Leo Liberti
Abstract:
We discuss the issue of finding a good mathematical programming solver configuration for a particular instance of a given problem, and we propose a two-phase approach to solve it. In the first phase we learn the relationships between the instance, the configuration and the performance of the configured solver on the given instance. A specific difficulty of learning a good solver configuration is that parameter settings may not all be independent; this requires enforcing (hard) constraints, something that many widely used supervised learning methods cannot natively achieve. We tackle this issue in the second phase of our approach, where we use the learnt information to construct and solve an optimization problem having an explicit representation of the dependency/consistency constraints on the configuration parameter settings. We discuss computational results for two different instantiations of this approach on a unit commitment problem arising in the short-term planning of hydro valleys. We use logistic regression as the supervised learning methodology and consider CPLEX as the solver of interest.
Submitted 10 January, 2024;
originally announced January 2024.
-
A learning-based mathematical programming formulation for the automatic configuration of optimization solvers
Authors:
Gabriele Iommazzo,
Claudia D'Ambrosio,
Antonio Frangioni,
Leo Liberti
Abstract:
We propose a methodology, based on machine learning and optimization, for selecting a solver configuration for a given instance. First, we employ a set of solved instances and configurations in order to learn a performance function of the solver. Secondly, we formulate a mixed-integer nonlinear program where the objective/constraints explicitly encode the learnt information, and which we solve, upon the arrival of an unknown instance, to find the best solver configuration for that instance, based on the performance function. The main novelty of our approach lies in the fact that the configuration set search problem is formulated as a mathematical program, which allows us to a) enforce hard dependence and compatibility constraints on the configurations, and b) solve it efficiently with off-the-shelf optimization tools.
Submitted 8 January, 2024;
originally announced January 2024.
-
An impossible utopia in distance geometry
Authors:
Germano Abud,
Jorge Alencar,
Carlile Lavor,
Leo Liberti,
Antonio Mucherino
Abstract:
The Distance Geometry Problem asks for a realization of a given weighted graph in $\mathbb{R}^K$. Two variants of this problem, both originating from protein conformation, are based on a given vertex order (which abstracts the protein backbone). Both variants involve an element of discrete decision in the realization of the next vertex in the order using $K$ preceding (already realized) vertices. The difference between these variants is that one requires the $K$ preceding vertices to be contiguous. The presence of this constraint allows one to prove, via a combinatorial counting of the number of solutions, that the realization algorithm is fixed-parameter tractable. Its absence, on the other hand, makes it possible to efficiently construct the vertex order directly from the graph. Deriving a combinatorial counting method without using the contiguity requirement would therefore be desirable. In this paper we prove that, unfortunately, such a counting method cannot be devised in general.
Submitted 4 October, 2021;
originally announced October 2021.
-
MIP and Set Covering approaches for Sparse Approximation
Authors:
Diego Delle Donne,
Matthieu Kowalski,
Leo Liberti
Abstract:
The Sparse Approximation problem asks to find a solution $x$ such that $||y - Hx|| < \alpha$, for a given norm $||\cdot||$, minimizing the size of the support $||x||_0 := \#\{j \ |\ x_j \neq 0 \}$. We present valid inequalities for Mixed Integer Programming (MIP) formulations for this problem and we show that these families are sufficient to describe the set of feasible supports. This leads to a reformulation of the problem as an Integer Programming (IP) model which in turn represents a Minimum Set Covering formulation, thus yielding many families of valid inequalities which may be used to strengthen the models. We propose algorithms to solve sparse approximation problems including a branch \& cut for the MIP, a two-stage algorithm to tackle the set covering IP and a heuristic approach based on Local Branching type constraints. These methods are compared in a computational experimentation with the goal of testing their practical potential.
Submitted 14 September, 2020;
originally announced September 2020.
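As an illustration of the problem statement in the abstract above, the following minimal sketch evaluates the support size $||x||_0$ and the feasibility condition $||y - Hx|| < \alpha$ for a candidate $x$. It is not the authors' method (no MIP, set covering or branch-and-cut is involved); the Euclidean norm, the function names and the tiny data set are all assumptions made for the example.

```python
import math

def support_size(x, tol=0.0):
    """Size of the support ||x||_0: the number of entries with |x_j| > tol."""
    return sum(1 for xj in x if abs(xj) > tol)

def residual_norm(H, x, y):
    """Euclidean norm of y - Hx (the choice of norm is an assumption)."""
    residual = [yi - sum(hij * xj for hij, xj in zip(row, x))
                for row, yi in zip(H, y)]
    return math.sqrt(sum(ri * ri for ri in residual))

def is_feasible(H, x, y, alpha):
    """Check the sparse approximation constraint ||y - Hx|| < alpha."""
    return residual_norm(H, x, y) < alpha

# Hypothetical data: two candidate solutions that both reproduce y exactly.
H = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
y = [1.0, 1.0]
dense = [1.0, 1.0, 0.0]   # uses two columns of H
sparse = [0.0, 0.0, 1.0]  # uses a single column of H
```

Both candidates satisfy the constraint, so the problem as stated would prefer `sparse`, whose support is smaller.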
-
A new algorithm for the $^K$DMDGP subclass of Distance Geometry Problems
Authors:
Douglas S. Goncalves,
Carlile Lavor,
Leo Liberti,
Michael Souza
Abstract:
The fundamental inverse problem in distance geometry is the one of finding positions from inter-point distances. The Discretizable Molecular Distance Geometry Problem (DMDGP) is a subclass of the Distance Geometry Problem (DGP) whose search space can be discretized and represented by a binary tree, which can be explored by a Branch-and-Prune (BP) algorithm. It turns out that this combinatorial search space possesses many interesting symmetry properties that were studied in the last decade. In this paper, we present a new algorithm for this subclass of the DGP, which exploits DMDGP symmetries more effectively than its predecessors. Computational results show that the speedup, with respect to the classic BP algorithm, is considerable for sparse DMDGP instances related to protein conformation.
Submitted 11 September, 2020;
originally announced September 2020.
-
Cycle-based formulations in Distance Geometry
Authors:
Leo Liberti,
Gabriele Iommazzo,
Carlile Lavor,
Nelson Maculan
Abstract:
The distance geometry problem asks to find a realization of a given simple edge-weighted graph in a Euclidean space of given dimension K, where the edges are realized as straight segments of lengths equal (or as close as possible) to the edge weights. The problem is often modelled as a mathematical programming formulation involving decision variables that determine the position of the vertices in the given Euclidean space. Solution algorithms are generally constructed using local or global nonlinear optimization techniques. We present a new modelling technique for this problem where, instead of deciding vertex positions, formulations decide the length of the segments representing the edges in each cycle in the graph, projected in every dimension. We propose an exact formulation and a relaxation based on a Eulerian cycle. We then compare computational results from protein conformation instances obtained with stochastic global optimization techniques on the new cycle-based formulation and on the existing edge-based formulation. While edge-based formulations take less time to reach termination, cycle-based formulations are generally better on solution quality measures.
Submitted 28 July, 2023; v1 submitted 20 June, 2020;
originally announced June 2020.
-
Distance Geometry and Data Science
Authors:
Leo Liberti
Abstract:
Data are often represented as graphs. Many common tasks in data science are based on distances between entities. While some data science methodologies natively take graphs as their input, there are many more that take their input in vectorial form. In this survey we discuss the fundamental problem of mapping graphs to vectors, and its relation with mathematical programming. We discuss applications, solution methods, dimensional reduction techniques and some of their limits. We then present an application of some of these ideas to neural networks, showing that distance geometry techniques can give competitive performance with respect to more traditional graph-to-vector mappings.
Submitted 18 September, 2019;
originally announced September 2019.
-
Open research areas in distance geometry
Authors:
Leo Liberti,
Carlile Lavor
Abstract:
Distance Geometry is based on the inverse problem that asks to find the positions of points, in a Euclidean space of given dimension, that are compatible with a given set of distances. We briefly introduce the field, and discuss some open and promising research areas.
Submitted 3 October, 2016;
originally announced October 2016.
-
New error measures and methods for realizing protein graphs from distance data
Authors:
Claudia D'Ambrosio,
Ky Vu,
Carlile Lavor,
Leo Liberti,
Nelson Maculan
Abstract:
The interval Distance Geometry Problem (iDGP) consists in finding a realization in $\mathbb{R}^K$ of a simple undirected graph $G=(V,E)$ with nonnegative intervals assigned to the edges in such a way that, for each edge, the Euclidean distance between the realization of the adjacent vertices is within the edge interval bounds. In this paper, we focus on the application to the conformation of proteins in space, which is a basic step in determining protein function: given interval estimations of some of the inter-atomic distances, find their shape. Among different families of methods for accomplishing this task, we look at mathematical programming based methods, which are well suited for dealing with intervals. The basic question we want to answer is: what is the best such method for the problem? The most meaningful error measure for evaluating solution quality is the coordinate root mean square deviation. We first introduce a new error measure which addresses a particular feature of protein backbones, i.e. many partial reflections also yield acceptable backbones. We then present a set of new and existing quadratic and semidefinite programming formulations of this problem, and a set of new and existing methods for solving these formulations. Finally, we perform a computational evaluation of all the feasible solver$+$formulation combinations according to new and existing error measures, finding that the best methodology is a new heuristic method based on multiplicative weights updates.
Submitted 4 July, 2016;
originally announced July 2016.
-
Gaussian random projections for Euclidean membership problems
Authors:
Ky Vu,
Pierre-Louis Poirion,
Leo Liberti
Abstract:
We discuss the application of random projections to the fundamental problem of deciding whether a given point in a Euclidean space belongs to a given set. We show that, under a number of different assumptions, the feasibility and infeasibility of this problem are preserved with high probability when the problem data is projected to a lower dimensional space. Our results are applicable to any algorithmic setting which needs to solve Euclidean membership problems in a high-dimensional space.
Submitted 18 November, 2015; v1 submitted 2 September, 2015;
originally announced September 2015.
-
Using the Johnson-Lindenstrauss lemma in linear and integer programming
Authors:
Ky Vu,
Pierre-Louis Poirion,
Leo Liberti
Abstract:
The Johnson-Lindenstrauss lemma allows dimension reduction on real vectors with low distortion on their pairwise Euclidean distances. This result is often used in algorithms such as $k$-means or $k$ nearest neighbours since they only use Euclidean distances, and has sometimes been used in optimization algorithms involving the minimization of Euclidean distances. In this paper we introduce a first attempt at using this lemma in the context of feasibility problems in linear and integer programming, which cannot be expressed solely as a function of Euclidean distances.
Submitted 3 July, 2015;
originally announced July 2015.
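The dimension reduction mentioned in the abstract above can be sketched with a Gaussian random projection, one common way of realizing the Johnson-Lindenstrauss lemma. This is only an illustration of the lemma on pairwise distances, not the paper's application to linear and integer programming; the dimensions, seeds and function names are assumptions chosen for the example.

```python
import math
import random

def gaussian_projection(points, k, seed=0):
    """Map d-dimensional points to k dimensions with a random Gaussian matrix."""
    rng = random.Random(seed)
    d = len(points[0])
    # Entries are i.i.d. N(0, 1/k), so squared norms are preserved in expectation.
    T = [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(d)]
         for _ in range(k)]
    return [[sum(row[j] * p[j] for j in range(d)) for row in T]
            for p in points]

def dist(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical data: 5 random points in dimension 200, projected to dimension 100.
rng = random.Random(1)
points = [[rng.uniform(-1.0, 1.0) for _ in range(200)] for _ in range(5)]
projected = gaussian_projection(points, k=100)

# Ratio of projected to original distance for every pair of points.
ratios = [dist(projected[i], projected[j]) / dist(points[i], points[j])
          for i in range(5) for j in range(i + 1, 5)]
```

Each entry of `ratios` should be close to 1, with the distortion shrinking as the target dimension `k` grows.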
-
Polynomial cases of the Discretizable Molecular Distance Geometry Problem
Authors:
Leo Liberti,
Carlile Lavor,
Benoit Masson,
Antonio Mucherino
Abstract:
An important application of distance geometry to biochemistry studies the embeddings of the vertices of a weighted graph in the three-dimensional Euclidean space such that the edge weights are equal to the Euclidean distances between corresponding point pairs. When the graph represents the backbone of a protein, one can exploit the natural vertex order to show that the search space for feasible embeddings is discrete. The corresponding decision problem can be solved using a binary tree based search procedure which is exponential in the worst case. We discuss assumptions that bound the search tree width to a polynomial size.
Submitted 7 March, 2011;
originally announced March 2011.
-
On the number of solutions of the discretizable molecular distance geometry problem
Authors:
Leo Liberti,
Benoit Masson,
Jon Lee,
Carlile Lavor,
Antonio Mucherino
Abstract:
The Generalized Discretizable Molecular Distance Geometry Problem is a distance geometry problem that can be solved by a combinatorial algorithm called ``Branch-and-Prune''. It was observed empirically that the number of solutions of YES instances is always a power of two. We give a proof that this event happens with probability one.
Submitted 9 October, 2010;
originally announced October 2010.
-
Fast paths in large-scale dynamic road networks
Authors:
Giacomo Nannicini,
Philippe Baptiste,
Gilles Barbier,
Daniel Krob,
Leo Liberti
Abstract:
Efficiently computing fast paths in large scale dynamic road networks (where dynamic traffic information is known over a part of the network) is a practical problem faced by several traffic information service providers who wish to offer a realistic fast path computation to GPS terminal enabled vehicles. The heuristic solution method we propose is based on a highway hierarchy-based shortest path algorithm for static large-scale networks; we maintain a static highway hierarchy and perform each query on the dynamically evaluated network.
Submitted 27 June, 2007; v1 submitted 9 April, 2007;
originally announced April 2007.
-
Performance Comparison of Function Evaluation Methods
Authors:
Leo Liberti
Abstract:
We perform a comparison of the performance and efficiency of four different function evaluation methods: black-box functions, binary trees, $n$-ary trees and string parsing. The test consists in evaluating 8 different functions of two variables $x,y$ over 5000 floating point values of the pair $(x,y)$. The outcome of the test indicates that the $n$-ary tree representation of algebraic expressions is the fastest method, closely followed by the black-box function method, then by binary trees and lastly by string parsing.
Submitted 14 July, 2002; v1 submitted 6 June, 2002;
originally announced June 2002.