-
Metaheuristics and Large Language Models Join Forces: Towards an Integrated Optimization Approach
Authors:
Camilo Chacón Sartori,
Christian Blum,
Filippo Bistaffa,
Guillem Rodríguez Corominas
Abstract:
Since the rise of Large Language Models (LLMs) a couple of years ago, researchers in metaheuristics (MHs) have wondered how to use their power in a beneficial way within their algorithms. This paper introduces a novel approach that leverages LLMs as pattern recognition tools to improve MHs. The resulting hybrid method, tested in the context of a social network-based combinatorial optimization problem, outperforms existing state-of-the-art approaches that combine machine learning with MHs regarding the obtained solution quality. By carefully designing prompts, we demonstrate that the output obtained from LLMs can be used as problem knowledge, leading to improved results. Lastly, we acknowledge LLMs' potential drawbacks and limitations and consider it essential to examine them to advance this type of research further.
Submitted 28 May, 2024;
originally announced May 2024.
-
Quantum Convolutional Neural Networks for the detection of Gamma-Ray Bursts in the AGILE space mission data
Authors:
A. Rizzo,
N. Parmiggiani,
A. Bulgarelli,
A. Macaluso,
V. Fioretti,
L. Castaldini,
A. Di Piano,
G. Panebianco,
C. Pittori,
M. Tavani,
C. Sartori,
C. Burigana,
V. Cardone,
F. Farsian,
M. Meneghetti,
G. Murante,
R. Scaramella,
F. Schillirò,
V. Testa,
T. Trombetti
Abstract:
Quantum computing represents a cutting-edge frontier in artificial intelligence. It makes use of hybrid quantum-classical computation, which tries to leverage quantum-mechanical principles that allow a different approach to deep learning classification problems. The work presented here falls within the context of the AGILE space mission, launched in 2007 by the Italian Space Agency. We implement different Quantum Convolutional Neural Networks (QCNN) that analyze data acquired by the instruments onboard AGILE to detect Gamma-Ray Bursts from sky maps or light curves. We use several frameworks such as TensorFlow-Quantum, Qiskit and PennyLane to simulate a quantum computer. We achieved an accuracy of 95.1% on sky maps with QCNNs, while the classical counterpart achieved 98.8% on the same data, although it used hundreds of thousands more parameters.
Submitted 22 April, 2024;
originally announced April 2024.
-
Large Language Models for the Automated Analysis of Optimization Algorithms
Authors:
Camilo Chacón Sartori,
Christian Blum,
Gabriela Ochoa
Abstract:
The ability of Large Language Models (LLMs) to generate high-quality text and code has fuelled their rise in popularity. In this paper, we aim to demonstrate the potential of LLMs within the realm of optimization algorithms by integrating them into STNWeb. This is a web-based tool for the generation of Search Trajectory Networks (STNs), which are visualizations of optimization algorithm behavior. Although visualizations produced by STNWeb can be very informative for algorithm designers, they often require a certain level of prior knowledge to be interpreted. In an attempt to bridge this knowledge gap, we have incorporated LLMs, specifically GPT-4, into STNWeb to produce extensive written reports, complemented by automatically generated plots, thereby enhancing the user experience and reducing the barriers to the adoption of this tool by the research community. Moreover, our approach can be expanded to other tools from the optimization community, showcasing the versatility and potential of LLMs in this field.
Submitted 13 February, 2024;
originally announced February 2024.
-
MAQA: A Quantum Framework for Supervised Learning
Authors:
Antonio Macaluso,
Matthias Klusch,
Stefano Lodi,
Claudio Sartori
Abstract:
Quantum Machine Learning has the potential to improve traditional machine learning methods and overcome some of the main limitations imposed by the classical computing paradigm. However, the practical advantages of using quantum resources to solve pattern recognition tasks are still to be demonstrated.
This work proposes a universal, efficient framework that can reproduce the output of a plethora of classical supervised machine learning algorithms while exploiting the advantages of quantum computation. The proposed framework is named Multiple Aggregator Quantum Algorithm (MAQA) due to its capability to combine multiple and diverse functions to solve typical supervised learning problems. In its general formulation, MAQA can potentially be adopted as the quantum counterpart of all those models falling into the scheme of aggregation of multiple functions, such as ensemble algorithms and neural networks. From a computational point of view, the proposed framework allows generating an exponentially large number of different transformations of the input at the cost of increasing the depth of the corresponding quantum circuit linearly. Thus, MAQA produces a model with substantial descriptive power to broaden the horizon of possible applications of quantum machine learning with a computational advantage over classical methods. As a second meaningful addition, we discuss the adoption of the proposed framework as a hybrid quantum-classical and fault-tolerant quantum algorithm.
Submitted 20 March, 2023;
originally announced March 2023.
-
Quantum Splines for Non-Linear Approximations
Authors:
Antonio Macaluso,
Luca Clissa,
Stefano Lodi,
Claudio Sartori
Abstract:
Quantum Computing offers a new paradigm for efficient computing and many AI applications could benefit from its potential boost in performance. However, the main limitation is the constraint to linear operations that hampers the representation of complex relationships in data. In this work, we propose an efficient implementation of quantum splines for non-linear approximation. In particular, we first discuss possible parametrisations, and select the most convenient for exploiting the HHL algorithm to obtain the estimates of spline coefficients. Then, we investigate QSpline performance as an evaluation routine for some of the most popular activation functions adopted in ML. Finally, a detailed comparison with classical alternatives to the HHL is also presented.
Submitted 9 March, 2023;
originally announced March 2023.
-
Enabling Non-Linear Quantum Operations through Variational Quantum Splines
Authors:
Matteo Antonio Inajetovic,
Filippo Orazi,
Antonio Macaluso,
Stefano Lodi,
Claudio Sartori
Abstract:
The postulates of quantum mechanics impose only unitary transformations on quantum states, which is a severe limitation for quantum machine learning algorithms. Quantum Splines (QSplines) have recently been proposed to approximate quantum activation functions to introduce non-linearity in quantum algorithms. However, QSplines make use of the HHL algorithm as a subroutine and require a fault-tolerant quantum computer to be correctly implemented. This work proposes the Generalised Hybrid Quantum Splines (GHQSplines), a novel method for approximating non-linear quantum activation functions using hybrid quantum-classical computation. The GHQSplines overcome the highly demanding requirements of the original QSplines in terms of quantum hardware and can be implemented using near-term quantum computers. Furthermore, the proposed method relies on a flexible problem representation for non-linear approximation and is suitable for embedding in existing quantum neural network architectures. In addition, we provide a practical implementation of the GHQSplines using PennyLane and show that our model outperforms the original QSplines in terms of quality of fitting.
Submitted 4 December, 2023; v1 submitted 8 March, 2023;
originally announced March 2023.
-
Models and algorithms for simple disjunctive temporal problems
Authors:
Carlo S. Sartori,
Pieter Smet,
Greet Vanden Berghe
Abstract:
Simple temporal problems represent a powerful class of models capable of describing the temporal relations between events that arise in many real-world applications such as logistics, robot planning and management systems. The classic simple temporal problem permits each event to have only a single release and due date. In this paper, we focus on the case where events may have an arbitrarily large number of release and due dates. This type of problem, however, has been referred to by various names. In order to simplify and standardize nomenclatures, we introduce the name Simple Disjunctive Temporal Problem. We provide three mathematical models to describe this problem using constraint programming and linear programming. To efficiently solve simple disjunctive temporal problems, we design two new algorithms inspired by previous research, both of which exploit the problem's structure to significantly reduce their space complexity. Additionally, we implement algorithms from the literature and provide the first in-depth empirical study comparing methods to solve simple disjunctive temporal problems across a wide range of experiments. Our analysis and conclusions offer guidance for future researchers and practitioners when tackling similar temporal constraint problems in new applications. All results, source code and instances are made publicly available to further assist future research.
Submitted 6 February, 2023;
originally announced February 2023.
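As a rough illustration of the classic simple temporal problem mentioned in the abstract above (one release and one due date per event), the following sketch checks consistency with Bellman-Ford on the standard distance graph. It is not the authors' model or code and does not cover the disjunctive case; the function name `solve_stp`, its argument layout, and the minimum-gap constraint format are assumptions made for this example.

```python
def solve_stp(release, due, min_gaps):
    """Classic simple temporal problem (illustrative sketch).

    release[i] <= t_i <= due[i] for every event i, and each
    (i, j, g) in min_gaps requires t_j - t_i >= g.
    Returns the latest feasible time of each event, or None if inconsistent.
    """
    n = len(release)
    # Node 0 is the time origin; event i is node i + 1.
    # An edge (u, v, c) encodes the difference constraint t_v - t_u <= c.
    edges = []
    for i in range(n):
        edges.append((0, i + 1, due[i]))       # t_i - t_0 <= due[i]
        edges.append((i + 1, 0, -release[i]))  # t_0 - t_i <= -release[i]
    for i, j, g in min_gaps:
        edges.append((j + 1, i + 1, -g))       # t_i - t_j <= -g

    dist = [float("inf")] * (n + 1)
    dist[0] = 0.0
    # Bellman-Ford relaxation: at most (number of nodes - 1) passes.
    for _ in range(n):
        changed = False
        for u, v, c in edges:
            if dist[u] + c < dist[v]:
                dist[v] = dist[u] + c
                changed = True
        if not changed:
            break
    # Any edge that can still be relaxed lies on a negative cycle: inconsistent.
    for u, v, c in edges:
        if dist[u] + c < dist[v]:
            return None
    return dist[1:]
```

Because every event has a due date, every node is reachable from the origin, so a negative cycle anywhere in the graph is detected. Shortest-path distances from the origin give the latest time each event can take in a consistent schedule.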
-
Self-Supervised Bernoulli Autoencoders for Semi-Supervised Hashing
Authors:
Ricardo Ñanculef,
Francisco Mena,
Antonio Macaluso,
Stefano Lodi,
Claudio Sartori
Abstract:
Semantic hashing is an emerging technique for large-scale similarity search based on representing high-dimensional data using similarity-preserving binary codes used for efficient indexing and search. It has recently been shown that variational autoencoders, with Bernoulli latent representations parametrized by neural nets, can be successfully trained to learn such codes in supervised and unsupervised scenarios, improving on more traditional methods thanks to their ability to handle the binary constraints architecturally. However, the scenario where labels are scarce has not been studied yet.
This paper investigates the robustness of hashing methods based on variational autoencoders to the lack of supervision, focusing on two semi-supervised approaches currently in use. The first augments the variational autoencoder's training objective to jointly model the distribution over the data and the class labels. The second approach exploits the annotations to define an additional pairwise loss that enforces consistency between the similarity in the code (Hamming) space and the similarity in the label space. Our experiments show that both methods can significantly increase the hash codes' quality. The pairwise approach can exhibit an advantage when the number of labelled points is large. However, we found that this method degrades quickly and loses its advantage when labelled samples decrease. To circumvent this problem, we propose a novel supervision method in which the model uses its label distribution predictions to implement the pairwise objective. Compared to the best baseline, this procedure yields similar performance in fully supervised settings but improves the results significantly when labelled data is scarce. Our code is made publicly available at https://github.com/amacaluso/SSB-VAE.
Submitted 17 July, 2020;
originally announced July 2020.
-
Quantum Ensemble for Classification
Authors:
Antonio Macaluso,
Luca Clissa,
Stefano Lodi,
Claudio Sartori
Abstract:
A powerful way to improve performance in machine learning is to construct an ensemble that combines the predictions of multiple models. Ensemble methods are often much more accurate and lower variance than the individual classifiers that make them up but have high requirements in terms of memory and computational time. In fact, a large number of alternative algorithms is usually adopted, each requiring to query all available data.
We propose a new quantum algorithm that exploits quantum superposition, entanglement and interference to build an ensemble of classification models. Thanks to the generation of several quantum trajectories in superposition, we obtain $B$ transformations of the quantum state which encodes the training set in only $\log(B)$ operations. This implies exponential growth of the ensemble size while the depth of the corresponding circuit increases only linearly. Furthermore, when considering the overall cost of the algorithm, we show that the training of a single weak classifier affects the overall time complexity additively rather than multiplicatively, as usually happens in classical ensemble methods.
We also present small-scale experiments on real-world datasets, defining a quantum version of the cosine classifier and using the IBM qiskit environment to show how the algorithms work.
Submitted 18 January, 2022; v1 submitted 2 July, 2020;
originally announced July 2020.
-
${\mathcal L}^1$ limit solutions in impulsive control
Authors:
Monica Motta,
Caterina Sartori
Abstract:
We consider a nonlinear control system depending on two controls u and v, with dynamics affine in the (unbounded) derivative of u, and v appearing initially only in the drift term. Recently, motivated by applications to optimization problems lacking coercivity, [1] proposed a notion of generalized solution x for this system, called limit solution, associated to measurable u and v, and with u of possibly unbounded variation in [0,T]. As shown in [1], when u and x have bounded variation, such a solution (called in this case BV simple limit solution) coincides with the most used graph completion solution (see e.g. [6]). This correspondence has been extended in [24] to BV_loc u and trajectories (with bounded variation just on any [0,t] with t<T). Starting with an example of optimal control where the minimum does not exist in the class of limit solutions, we propose a notion of extended limit solution x, for which such a minimum exists. As a first result, we prove that extended and original limit solutions coincide in the special cases of BV and BV_loc inputs u (and solutions). Then we consider dynamics where the ordinary control v also appears in the non-drift terms. For the associated system we prove that, in the BV case, extended limit solutions coincide with graph completion solutions.
Submitted 1 June, 2017;
originally announced June 2017.
-
Unbounded variation and solutions of impulsive control systems
Authors:
Monica Motta,
Caterina Sartori
Abstract:
We consider a control system with dynamics which are affine in the (unbounded) derivative of the control $u$. We introduce a notion of generalized solution $x$ on $[0,T]$ for controls $u$ of bounded total variation on $[0,t]$ for every $t<T$, but of possibly infinite variation on $[0,T]$. This solution has a simple representation formula based on the so-called graph completion approach, originally developed for BV controls.
We prove the well-posedness of this generalized solution by showing that $x$ is a limit solution, that is the pointwise limit of regular trajectories of the system. In particular, we single out the subset of limit solutions which is in one-to-one correspondence with the set of generalized solutions. The controls that we consider provide the natural setting for treating some questions on the controllability of the system and some optimal control problems with endpoint constraints and lack of coercivity.
Submitted 4 May, 2017;
originally announced May 2017.
-
Network Slicing to Enable Scalability and Flexibility in 5G Mobile Networks
Authors:
P. Rost,
C. Mannweiler,
D. S. Michalopoulos,
C. Sartori,
V. Sciancalepore,
N. Sastry,
O. Holland,
S. Tayade,
B. Han,
D. Bega,
D. Aziz,
H. Bakker
Abstract:
We argue for network slicing as an efficient solution that addresses the diverse requirements of 5G mobile networks, thus providing the necessary flexibility and scalability associated with future network implementations. We elaborate on the challenges that emerge when we design 5G networks based on network slicing. We focus on the architectural aspects associated with the coexistence of dedicated as well as shared slices in the network. In particular, we analyze the realization options of a flexible radio access network with focus on network slicing and their impact on the design of 5G mobile networks. In addition to the technical study, this paper provides an investigation of the revenue potential of network slicing, where the applications that originate from such concept and the profit capabilities from the network operator's perspective are put forward.
Submitted 7 April, 2017;
originally announced April 2017.
-
Fast and Scalable Lasso via Stochastic Frank-Wolfe Methods with a Convergence Guarantee
Authors:
Emanuele Frandi,
Ricardo Nanculef,
Stefano Lodi,
Claudio Sartori,
Johan A. K. Suykens
Abstract:
Frank-Wolfe (FW) algorithms have been often proposed over the last few years as efficient solvers for a variety of optimization problems arising in the field of Machine Learning. The ability to work with cheap projection-free iterations and the incremental nature of the method make FW a very effective choice for many large-scale problems where computing a sparse model is desirable.
In this paper, we present a high-performance implementation of the FW method tailored to solve large-scale Lasso regression problems, based on a randomized iteration, and prove that the convergence guarantees of the standard FW method are preserved in the stochastic setting. We show experimentally that our algorithm outperforms several existing state of the art methods, including the Coordinate Descent algorithm by Friedman et al. (one of the fastest known Lasso solvers), on several benchmark datasets with a very large number of features, without sacrificing the accuracy of the model. Our results illustrate that the algorithm is able to generate the complete regularization path on problems of size up to four million variables in less than one minute.
Submitted 24 October, 2015;
originally announced October 2015.
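To make the idea of a randomized Frank-Wolfe iteration for Lasso concrete, here is a minimal sketch, not the authors' high-performance implementation. It assumes the constrained form of Lasso, minimising 0.5*||Xw - y||^2 subject to ||w||_1 <= delta, and picks the update coordinate from a uniformly sampled subset of features. The function name `frank_wolfe_lasso`, its parameters, and the sampling scheme are assumptions made for this example.

```python
import random

def frank_wolfe_lasso(X, y, delta, iters=200, sample_size=None, seed=0):
    """Approximately minimise 0.5*||Xw - y||^2 subject to ||w||_1 <= delta."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    k = p if sample_size is None else min(sample_size, p)
    w = [0.0] * p
    residual = [-yi for yi in y]  # residual = Xw - y, starting from w = 0
    for _ in range(iters):
        # Randomised step: evaluate the gradient only on a sampled subset of coordinates.
        coords = rng.sample(range(p), k)
        grads = {j: sum(X[i][j] * residual[i] for i in range(n)) for j in coords}
        j_star = max(coords, key=lambda j: abs(grads[j]))
        # Vertex of the l1 ball minimising the linearised objective over the sample.
        s_val = -delta if grads[j_star] > 0 else delta
        # Direction d = s - w, tracked through X d = s_val * X[:, j_star] - X w.
        Xd = [s_val * X[i][j_star] - (residual[i] + y[i]) for i in range(n)]
        denom = sum(v * v for v in Xd)
        if denom == 0.0:
            break
        # Exact line search for the quadratic objective, clipped to [0, 1].
        gamma = -sum(r * v for r, v in zip(residual, Xd)) / denom
        gamma = max(0.0, min(1.0, gamma))
        # Convex combination of the current iterate and the chosen vertex.
        w = [(1.0 - gamma) * wj for wj in w]
        w[j_star] += gamma * s_val
        residual = [r + gamma * v for r, v in zip(residual, Xd)]
    return w
```

Each iterate is a convex combination of vertices of the l1 ball, so it stays feasible without any projection and has at most one new non-zero coefficient per iteration. This is the property that makes Frank-Wolfe attractive when a sparse model is wanted.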
-
Asymptotic problems in optimal control with a vanishing Lagrangian and unbounded data
Authors:
Monica Motta,
Caterina Sartori
Abstract:
In this paper we give a representation formula for the limit of the finite horizon problem as the horizon becomes infinite, with a nonnegative Lagrangian and unbounded data. It is related to the limit of the discounted infinite horizon problem, as the discount factor goes to zero. We give sufficient conditions to characterize the limit function as the unique nonnegative solution of the associated HJB equation. We also briefly discuss the ergodic problem.
Submitted 30 June, 2014;
originally announced June 2014.
-
The value function of an asymptotic exit-time optimal control problem
Authors:
Monica Motta,
Caterina Sartori
Abstract:
We consider a class of exit-time control problems for nonlinear systems with a nonnegative vanishing Lagrangian. In general, the associated PDE may have multiple solutions, and known regularity and stability properties do not hold. In this paper we obtain such properties and a uniqueness result under some explicit sufficient conditions. We also briefly investigate the infinite horizon problem.
Submitted 20 March, 2014; v1 submitted 28 December, 2013;
originally announced December 2013.
-
A Novel Frank-Wolfe Algorithm. Analysis and Applications to Large-Scale SVM Training
Authors:
Hector Allende,
Emanuele Frandi,
Ricardo Nanculef,
Claudio Sartori
Abstract:
Recently, there has been a renewed interest in the machine learning community in variants of a sparse greedy approximation procedure for concave optimization known as the Frank-Wolfe (FW) method. In particular, this procedure has been successfully applied to train large-scale instances of non-linear Support Vector Machines (SVMs). Specializing FW to SVM training has yielded not only efficient algorithms but also important theoretical results, including convergence analysis of training algorithms and new characterizations of model sparsity.
In this paper, we present and analyze a novel variant of the FW method based on a new way to perform away steps, a classic strategy used to accelerate the convergence of the basic FW procedure. Our formulation and analysis are focused on a general concave maximization problem on the simplex. However, the specialization of our algorithm to quadratic forms is strongly related to some classic methods in computational geometry, namely the Gilbert and MDM algorithms.
On the theoretical side, we demonstrate that the method matches the guarantees in terms of convergence rate and number of iterations obtained by using classic away steps. In particular, the method enjoys a linear rate of convergence, a result that has been recently proved for MDM on quadratic forms.
On the practical side, we provide experiments on several classification datasets, and evaluate the results using statistical tests. Experiments show that our method is faster than the FW method with classic away steps, and works well even in the cases in which classic away steps slow down the algorithm. Furthermore, these improvements are obtained without sacrificing the predictive accuracy of the obtained SVM model.
Submitted 13 October, 2013; v1 submitted 3 April, 2013;
originally announced April 2013.
-
Training Support Vector Machines Using Frank-Wolfe Optimization Methods
Authors:
Emanuele Frandi,
Ricardo Nanculef,
Maria Grazia Gasparo,
Stefano Lodi,
Claudio Sartori
Abstract:
Training a Support Vector Machine (SVM) requires the solution of a quadratic programming problem (QP) whose computational complexity becomes prohibitively expensive for large scale datasets. Traditional optimization methods cannot be directly applied in these cases, mainly due to memory restrictions.
By adopting a slightly different objective function and under mild conditions on the kernel used within the model, efficient algorithms to train SVMs have been devised under the name of Core Vector Machines (CVMs). This framework exploits the equivalence of the resulting learning problem with the task of building a Minimal Enclosing Ball (MEB) in a feature space, where data is implicitly embedded by a kernel function.
In this paper, we improve on the CVM approach by proposing two novel methods to build SVMs based on the Frank-Wolfe algorithm, recently revisited as a fast method to approximate the solution of a MEB problem. In contrast to CVMs, our algorithms do not require computing the solutions of a sequence of increasingly complex QPs and are defined by using only analytic optimization steps. Experiments on a large collection of datasets show that our methods scale better than CVMs in most cases, sometimes at the price of a slightly lower accuracy. Like CVMs, the proposed methods can be easily extended to machine learning problems other than binary classification. However, effective classifiers are also obtained using kernels which do not satisfy the condition required by CVMs and can thus be used for a wider set of problems.
Submitted 4 December, 2012;
originally announced December 2012.