-
Adaptive Stabilization Based on Machine Learning for Column Generation
Authors:
Yunzhuang Shen,
Yuan Sun,
Xiaodong Li,
Zhiguang Cao,
Andrew Eberhard,
Guangquan Zhang
Abstract:
Column generation (CG) is a well-established method for solving large-scale linear programs. It involves iteratively optimizing a subproblem containing a subset of columns and using its dual solution to generate new columns with negative reduced costs. This process continues until the dual values converge to the optimal dual solution to the original problem. A natural phenomenon in CG is the heavy oscillation of the dual values during iterations, which can lead to a substantial slowdown in the convergence rate. Stabilization techniques are devised to accelerate the convergence of dual values by using information beyond the state of the current subproblem. However, there remains a significant gap in obtaining more accurate dual values at an earlier stage. To further narrow this gap, this paper introduces a novel approach consisting of 1) a machine learning approach for accurate prediction of optimal dual solutions and 2) an adaptive stabilization technique that effectively capitalizes on accurate predictions. On the graph coloring problem, we show that our method achieves a significantly improved convergence rate compared to traditional methods.
Submitted 18 May, 2024;
originally announced May 2024.
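As a rough illustration of the column generation abstract above, the sketch below computes the reduced cost of a candidate column for graph coloring (an independent set with unit cost) and applies a simple fixed-weight dual smoothing. The smoothing rule, the function names and the weight `alpha` are assumptions chosen for illustration; this is not the adaptive, prediction-based stabilization proposed in the paper.

```python
def reduced_cost(duals, column, cost=1.0):
    """Reduced cost of a column: its cost minus the duals of the vertices it covers."""
    return cost - sum(duals[v] for v in column)


def smooth_duals(center, current, alpha):
    """Convex combination of a stability center and the current duals (0 <= alpha <= 1)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * c + (1.0 - alpha) * d for c, d in zip(center, current)]


# Hypothetical data: four vertices, duals from the current restricted problem,
# and a stability center (for example, a predicted dual solution).
current = [0.9, 0.0, 0.7, 0.1]
center = [0.5, 0.5, 0.5, 0.5]
column = [0, 2]  # candidate independent set {0, 2}

smoothed = smooth_duals(center, current, alpha=0.5)
print(reduced_cost(current, column))   # negative, so the column can enter the restricted problem
print(reduced_cost(smoothed, column))  # reduced cost under the smoothed duals
```

A column with negative reduced cost is a candidate to add; smoothing damps the oscillation of the duals at the price of sometimes pricing against a point that is not the current dual solution.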
-
Bregman Proximal Linearized ADMM for Minimizing Separable Sums Coupled by a Difference of Functions
Authors:
Tan Nhat Pham,
Minh N. Dao,
Andrew Eberhard,
Nargiz Sultanova
Abstract:
In this paper, we develop a splitting algorithm incorporating Bregman distances to solve a broad class of linearly constrained composite optimization problems, whose objective function is the separable sum of possibly nonconvex nonsmooth functions and a smooth function, coupled by a difference of functions. This structure encapsulates numerous significant nonconvex and nonsmooth optimization problems in the current literature including the linearly constrained difference-of-convex problems. Relying on the successive linearization and alternating direction method of multipliers (ADMM), the proposed algorithm exhibits the global subsequential convergence to a stationary point of the underlying problem. We also establish the convergence of the full sequence generated by our algorithm under the Kurdyka-Lojasiewicz property and some mild assumptions. The efficiency of the proposed algorithm is tested on a robust principal component analysis problem and a nonconvex optimal power flow problem.
Submitted 4 January, 2024;
originally announced January 2024.
-
Actively Learning Costly Reward Functions for Reinforcement Learning
Authors:
André Eberhard,
Houssam Metni,
Georg Fahland,
Alexander Stroh,
Pascal Friederich
Abstract:
Transfer of recent advances in deep reinforcement learning to real-world applications is hindered by high data demands and thus low efficiency and scalability. Through independent improvements of components such as replay buffers or more stable learning algorithms, and through massively distributed systems, training time could be reduced from several days to several hours for standard benchmark tasks. However, while rewards in simulated environments are well-defined and easy to compute, reward evaluation becomes the bottleneck in many real-world environments, e.g., in molecular optimization tasks, where computationally demanding simulations or even experiments are required to evaluate states and to quantify rewards. Therefore, training might become prohibitively expensive without an extensive amount of computational resources and time. We propose to alleviate this problem by replacing costly ground-truth rewards with rewards modeled by neural networks, counteracting non-stationarity of state and reward distributions during training with an active learning component. We demonstrate that using our proposed ACRL method (Actively learning Costly rewards for Reinforcement Learning), it is possible to train agents in complex real-world environments orders of magnitude faster. By enabling the application of reinforcement learning methods to new domains, we show that we can find interesting and non-trivial solutions to real-world optimization problems in chemistry, materials science and engineering.
Submitted 23 November, 2022;
originally announced November 2022.
-
Graph neural networks for materials science and chemistry
Authors:
Patrick Reiser,
Marlen Neubert,
André Eberhard,
Luca Torresi,
Chen Zhou,
Chen Shao,
Houssam Metni,
Clint van Hoesel,
Henrik Schopmans,
Timo Sommer,
Pascal Friederich
Abstract:
Machine learning plays an increasingly important role in many areas of chemistry and materials science, e.g. to predict materials properties, to accelerate simulations, to design new materials, and to predict synthesis routes of new materials. Graph neural networks (GNNs) are one of the fastest growing classes of machine learning models. They are of particular relevance for chemistry and materials science, as they directly work on a graph or structural representation of molecules and materials and therefore have full access to all relevant information required to characterize materials. In this review article, we provide an overview of the basic principles of GNNs, widely used datasets, and state-of-the-art architectures, followed by a discussion of a wide range of recent applications of GNNs in chemistry and materials science, and concluding with a road-map for the further development and application of GNNs.
Submitted 5 August, 2022;
originally announced August 2022.
-
Adaptive Solution Prediction for Combinatorial Optimization
Authors:
Yunzhuang Shen,
Yuan Sun,
Xiaodong Li,
Andrew Eberhard,
Andreas Ernst
Abstract:
This paper aims to predict optimal solutions for combinatorial optimization problems (COPs) via machine learning (ML). To find high-quality solutions efficiently, existing work uses an ML prediction of the optimal solution to guide heuristic search, where the ML model is trained offline under the supervision of solved problem instances with known optimal solutions. To predict the optimal solution with sufficient accuracy, it is critical to provide an ML model with adequate features that can effectively characterize decision variables. However, acquiring such features is challenging due to the high complexity of COPs. This paper proposes a framework that can better characterize decision variables by harnessing feedback from a heuristic search over several iterative steps, enabling an offline-trained ML model to predict the optimal solution in an adaptive manner. We refer to this approach as adaptive solution prediction (ASP). Specifically, we employ a set of statistical measures as features, which can extract useful information from feasible solutions found by a heuristic search and inform the ML model as to which value a decision variable is likely to take in high-quality solutions. Our experiments on three NP-hard COPs show that ASP substantially improves the prediction quality of an offline-trained ML model and achieves competitive results compared to several heuristic methods in terms of solution quality. Furthermore, we demonstrate that ASP can be used as a heuristic-pricing method for column generation, to boost an exact branch-and-price algorithm for solving the graph coloring problem.
Submitted 26 January, 2023; v1 submitted 19 April, 2022;
originally announced April 2022.
-
Enhancing Column Generation by a Machine-Learning-Based Pricing Heuristic for Graph Coloring
Authors:
Yunzhuang Shen,
Yuan Sun,
Xiaodong Li,
Andrew Eberhard,
Andreas Ernst
Abstract:
Column Generation (CG) is an effective method for solving large-scale optimization problems. CG starts by solving a subproblem with a subset of columns (i.e., variables) and gradually includes new columns that can improve the solution of the current subproblem. The new columns are generated as needed by repeatedly solving a pricing problem, which is often NP-hard and is a bottleneck of the CG approach. To tackle this, we propose a Machine-Learning-based Pricing Heuristic (MLPH) that can generate many high-quality columns efficiently. In each iteration of CG, our MLPH leverages an ML model to predict the optimal solution of the pricing problem, which is then used to guide a sampling method to efficiently generate multiple high-quality columns. Using the graph coloring problem, we empirically show that MLPH significantly enhances CG as compared to six state-of-the-art methods, and the improvement in CG can lead to substantially better performance of the branch-and-price exact method.
Submitted 7 March, 2022; v1 submitted 7 December, 2021;
originally announced December 2021.
-
Learning Primal Heuristics for Mixed Integer Programs
Authors:
Yunzhuang Shen,
Yuan Sun,
Andrew Eberhard,
Xiaodong Li
Abstract:
This paper proposes a novel primal heuristic for Mixed Integer Programs, by employing machine learning techniques. Mixed Integer Programming is a general technique for formulating combinatorial optimization problems. Inside a solver, primal heuristics play a critical role in finding good feasible solutions that enable one to tighten the duality gap from the outset of the Branch-and-Bound algorithm (B&B), greatly improving its performance by pruning the B&B tree aggressively. In this paper, we investigate whether effective primal heuristics can be automatically learned via machine learning. We propose a new method to represent an optimization problem as a graph, and train a Graph Convolutional Network on solved problem instances with known optimal solutions. This in turn can predict the values of decision variables in the optimal solution for an unseen problem instance of a similar type. The predicted variable values are then leveraged by a novel configuration of the B&B method, the Probabilistic Branching with guided Depth-first Search (PB-DFS) approach, which aims to find (near-)optimal solutions quickly. The experimental results show that this new heuristic can find better primal solutions at a much earlier stage of the solving process, compared to other state-of-the-art primal heuristics.
Submitted 2 July, 2021;
originally announced July 2021.
-
Implementing graph neural networks with TensorFlow-Keras
Authors:
Patrick Reiser,
Andre Eberhard,
Pascal Friederich
Abstract:
Graph neural networks are a versatile machine learning architecture that has received a lot of attention recently. In this technical report, we present an implementation of convolution and pooling layers for TensorFlow-Keras models, which allows a seamless and flexible integration into standard Keras layers to set up graph models in a functional way. This implies the usage of mini-batches as the first tensor dimension, which can be realized via the new RaggedTensor class of TensorFlow, which is best suited for graphs. We developed the Keras Graph Convolutional Neural Network Python package kgcnn based on TensorFlow-Keras, which provides a set of Keras layers for graph networks that focus on a transparent tensor structure passed between layers and an ease-of-use mindset.
Submitted 7 March, 2021;
originally announced March 2021.
-
Convergence Rates for Boundedly Regular Systems
Authors:
Ernö Robert Csetnek,
Andrew Eberhard,
Matthew K. Tam
Abstract:
In this work, we consider a continuous dynamical system associated with the fixed point set of a nonexpansive operator which was originally studied by Boţ & Csetnek (2015). Our main results establish convergence rates for the system's trajectories when the nonexpansive operator satisfies an additional regularity property. This setting is the natural continuous-time analogue to discrete-time results obtained in Bauschke, Noll & Phan (2015) and Borwein, Li & Tam (2017) by using the same regularity properties.
Submitted 7 June, 2021; v1 submitted 2 April, 2020;
originally announced April 2020.
-
Explicit Formula for Preimages of Relaxed One-Sided Lipschitz Mappings with Negative Lipschitz Constants
Authors:
Andrew C. Eberhard,
Boris S. Mordukhovich,
Janosch Rieger
Abstract:
This paper addresses Lipschitzian stability issues that play an important role in both theoretical and numerical aspects of variational analysis, optimization, and their applications. We particularly concentrate on the so-called relaxed one-sided Lipschitz property of set-valued mappings with negative Lipschitz constants. This property has been much less investigated than more conventional Lipschitzian behavior while being well recognized in a variety of applications. Recent work has revealed that set-valued mappings satisfying the relaxed one-sided Lipschitz condition with negative Lipschitz constant possess a localization property that is stronger than uniform metric regularity. The present paper complements this fact by providing a characterization not only of one specific single point of a preimage, but of entire preimages of such mappings. Developing a geometric approach, we derive an explicit formula to calculate preimages of relaxed one-sided Lipschitz mappings between finite-dimensional spaces and obtain a further specification of this formula via extreme points of image sets.
Submitted 25 February, 2019; v1 submitted 4 February, 2019;
originally announced February 2019.
-
Combining Progressive Hedging with a Frank-Wolfe Method to Compute Lagrangian Dual Bounds in Stochastic Mixed-Integer Programming
Authors:
Natashia Boland,
Jeffrey Christiansen,
Brian Dandurand,
Andrew Eberhard,
Jeff Linderoth,
James Luedtke
Abstract:
We present a new primal-dual algorithm for computing the value of the Lagrangian dual of a stochastic mixed-integer program (SMIP) formed by relaxing its nonanticipativity constraints. This dual is widely used in decomposition methods for the solution of SMIPs. The algorithm relies on the well-known progressive hedging method, but unlike previous progressive hedging approaches for SMIP, our algorithm can be shown to converge to the optimal Lagrangian dual value. The key improvement in the new algorithm is an inner loop of optimized linearization steps, similar to those taken in the classical Frank-Wolfe method. Numerical results demonstrate that our new algorithm empirically outperforms the standard implementation of progressive hedging for obtaining bounds in SMIP.
Submitted 2 February, 2017;
originally announced February 2017.
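For readers unfamiliar with the linearization steps mentioned in the abstract above, the following is a minimal sketch of the classical Frank-Wolfe method on a toy problem: minimizing 0.5 * ||x - b||^2 over the probability simplex. The test problem, the function name and the step size 2 / (k + 2) are assumptions for illustration only; the sketch does not reproduce the progressive hedging algorithm of the paper.

```python
def frank_wolfe_simplex(b, iterations=200):
    """Minimize 0.5 * ||x - b||^2 over the probability simplex with Frank-Wolfe."""
    n = len(b)
    x = [1.0 / n] * n  # feasible starting point (barycenter of the simplex)
    for k in range(iterations):
        grad = [xi - bi for xi, bi in zip(x, b)]
        # Linearization step: a linear function over the simplex attains its
        # minimum at a vertex, namely the coordinate with the smallest gradient entry.
        i = min(range(n), key=lambda j: grad[j])
        step = 2.0 / (k + 2.0)
        # Move towards the chosen vertex; the iterate stays in the simplex.
        x = [(1.0 - step) * xj for xj in x]
        x[i] += step
    return x


target = [0.2, 0.3, 0.5]  # already in the simplex, so it is the minimizer
print(frank_wolfe_simplex(target, iterations=1000))
```

Each iteration only needs the solution of a linear problem over the feasible set, which is what makes this kind of step attractive inside decomposition schemes.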
-
A parallelizable augmented Lagrangian method applied to large-scale non-convex-constrained optimization problems
Authors:
Brian Dandurand,
Natashia Boland,
Jeffrey Christiansen,
Andrew Eberhard,
Fabricio Oliveira
Abstract:
We contribute improvements to a Lagrangian dual solution approach applied to large-scale optimization problems whose objective functions are convex, continuously differentiable and possibly nonlinear, while the non-relaxed constraint set is compact but not necessarily convex. Such problems arise, for example, in the split-variable deterministic reformulation of stochastic mixed-integer optimization problems. The dual solution approach needs to address the nonconvexity of the non-relaxed constraint set while being efficiently implementable in parallel. We adapt the augmented Lagrangian method framework to address the presence of nonconvexity in the non-relaxed constraint set and the need for efficient parallelization. The development of our approach is most naturally compared with the development of proximal bundle methods and especially with their use of serious step conditions. However, deviations from these developments allow for an improvement in efficiency with which parallelization can be utilized. Pivotal in our modification to the augmented Lagrangian method is the use of an integration of approaches based on the simplicial decomposition method (SDM) and the nonlinear block Gauss-Seidel (GS) method. An adaptation of a serious step condition associated with proximal bundle methods allows for the approximation tolerance to be automatically adjusted. Under mild conditions optimal dual convergence is proven, and we report computational results on test instances from the stochastic optimization literature. We demonstrate improvement in parallel speedup over a baseline parallel approach.
Submitted 1 February, 2017;
originally announced February 2017.
-
A Relax-and-Decomposition Algorithm for a p-Robust Hub Location Problem
Authors:
Saeid Abbasi Parizi,
Mahdi Bashiri,
Andrew Eberhard
Abstract:
In this paper, a non-linear p-robust hub location problem is extended to a risky environment where an augmented chance constraint with a min-max regret form is employed to consider network risk as one of the objectives. The model considers risk factors such as security, air pollution and congestion to design the robust hub network. A Monte-Carlo simulation based algorithm, namely a sample average approximation scheme, is applied to select a set of efficient scenarios. The problem is then solved using a novel relax-and-decomposition heuristic based on the coupling of an accelerated Benders decomposition with a Lagrangian relaxation method. To improve the decomposition mechanism, a multi-Pareto cut version is applied in the proposed algorithm. In our numerical tests, a modification of the well-known CAB data set is used with different levels of parameter uncertainty. The results demonstrate the capability of the proposed model to design a robust network. We also verify the accuracy of the sample average approximation method. Finally, the results of the proposed algorithm for different instances are compared to those of other solution approaches, which confirms the efficiency of the proposed solution method.
Submitted 31 January, 2017;
originally announced February 2017.
-
Combining Penalty-based and Gauss-Seidel Methods for solving Stochastic Mixed-Integer Problems
Authors:
Fabricio Oliveira,
Jeffrey Christiansen,
Brian Dandurand,
Andrew Eberhard
Abstract:
In this paper, we propose a novel decomposition approach for mixed-integer stochastic programming (SMIP) problems that is inspired by the combination of penalty-based Lagrangian and block Gauss-Seidel methods (PBGS). In this sense, PBGS is developed such that the inherent decomposable structure that SMIPs present can be exploited in a computationally efficient manner. The performance of the proposed method is compared with the Progressive Hedging method (PH), which also can be viewed as a Lagrangian-based method for obtaining solutions for SMIP. Numerical experiments performed using instances from the literature illustrate the efficiency of the proposed method in terms of computational performance and solution quality.
Submitted 31 January, 2017;
originally announced February 2017.
-
Outer limits of subdifferentials for min-max type functions
Authors:
Andrew Eberhard,
Vera Roshchina,
Tian Sang
Abstract:
We generalise the outer subdifferential construction suggested by Cánovas, Henrion, López and Parra for max type functions to pointwise minima of regular Lipschitz functions. We also answer an open question about the relation between the outer subdifferential of the support of a regular function and the end set of its subdifferential posed by Li, Meng and Yang.
Submitted 29 August, 2017; v1 submitted 11 January, 2017;
originally announced January 2017.
-
On Partial Smoothness, Tilt Stability and the $\mathcal{VU}$--Decomposition
Authors:
Andrew Eberhard,
Yousong Luo,
Shuai Liu
Abstract:
Under the assumption of prox-regularity and the presence of a tilt stable local minimum we are able to show that a $\mathcal{VU}$ like decomposition gives rise to the existence of a smooth manifold on which the function in question coincides locally with a smooth function.
Submitted 5 April, 2017; v1 submitted 24 February, 2016;
originally announced February 2016.
-
The U-Lagrangian of a prox-regular function
Authors:
Shuai Liu,
Andrew Eberhard,
Yousong Luo
Abstract:
When restricted to a subspace, a nonsmooth function can be differentiable. It is known that for a nonsmooth convex function f and a point x, the Euclidean space can be decomposed into two subspaces: U, over which a special Lagrangian can be defined and has nice smooth properties and V, the orthogonal complement subspace of U. In this paper we generalize the definition of UV-decomposition and U-Lagrangian to the context of nonconvex functions, specifically that of a prox-regular function.
Submitted 29 October, 2015;
originally announced October 2015.
-
A version of bundle method with linear programming
Authors:
Shuai Liu,
Andrew Eberhard,
Yousong Luo
Abstract:
Bundle methods have been intensively studied for solving both convex and nonconvex optimization problems. In most of the bundle methods developed thus far, at least one quadratic programming (QP) subproblem needs to be solved in each iteration. In this paper, we explore the feasibility of developing a bundle algorithm that only solves linear subproblems. We start from minimization of a convex function and show that the sequence of major iterations converges to a minimizer. For nonconvex functions we consider functions that are locally Lipschitz continuous and prox-regular on a bounded level set, and minimize the cutting-plane model over a trust region with infinity norm. The para-convexity of such functions allows us to use the locally convexified model and its convexity properties. Under some conditions and assumptions, we study the convergence of the proposed algorithm through the outer semicontinuity of the proximal mapping. Encouraging results of preliminary numerical experiments on standard test sets are provided.
Submitted 7 July, 2015; v1 submitted 5 February, 2015;
originally announced February 2015.
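The cutting-plane model referred to in the abstract above can be sketched in a few lines for the convex, one-dimensional case. The bundle format (point, function value, subgradient) and the names below are assumptions for illustration; the sketch only evaluates the model and does not implement the trust-region linear subproblem or the nonconvex extension studied in the paper.

```python
def cutting_plane_model(bundle, x):
    """Evaluate max_i [ f(x_i) + g_i * (x - x_i) ] for a bundle of (x_i, f(x_i), g_i)."""
    return max(fi + gi * (x - xi) for xi, fi, gi in bundle)


def f(x):
    """Hypothetical convex test function."""
    return x * x


def df(x):
    """Gradient of the test function (a subgradient in the convex case)."""
    return 2.0 * x


# Build a bundle from three trial points.
bundle = [(p, f(p), df(p)) for p in (-1.0, 0.5, 2.0)]

for x in (-1.0, 0.0, 1.0):
    # For convex f the model is a lower bound that is exact at the bundle points.
    print(x, cutting_plane_model(bundle, x), f(x))
```

Because the model is a maximum of affine functions, minimizing it over a box (an infinity-norm trust region) can be written as a linear program, which is the property the abstract relies on.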
-
Orbital Geometry in Optimisation
Authors:
Andrew Eberhard,
Vera Roshchina
Abstract:
We discuss the use of group symmetries in optimisation, in particular with respect to the structure of subdifferential and projection operators. This allows us to generalise a classic result of Adrian Lewis regarding the characterisation of the subdifferential of a permutation invariant convex function to the characterisation of the proximal subdifferential of a Schur convex function that is invariant with respect to a finite reflection group. We are also able to simplify and generalise results on projections onto symmetric sets, in particular, we study projections on sparsity constraints used in sparse signal recovery and compressed sensing.
Submitted 24 July, 2015; v1 submitted 29 October, 2014;
originally announced October 2014.
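As a concrete instance of the projections onto sparsity constraints mentioned in the abstract above, the sketch below computes a Euclidean projection onto the set of vectors with at most k nonzero entries by keeping the k entries of largest magnitude. The function name and the tie-breaking rule (lowest index first) are assumptions for illustration; when magnitudes tie, the projection is not unique and this returns one valid choice.

```python
def project_onto_sparse(x, k):
    """Return a Euclidean projection of x onto {z : z has at most k nonzero entries}."""
    if k < 0:
        raise ValueError("k must be non-negative")
    # Indices sorted by decreasing magnitude; sorted() is stable, so ties keep index order.
    order = sorted(range(len(x)), key=lambda i: -abs(x[i]))
    keep = set(order[:k])
    return [xi if i in keep else 0.0 for i, xi in enumerate(x)]


print(project_onto_sparse([3.0, -5.0, 1.0, 4.0], 2))
```

The set of k-sparse vectors is invariant under coordinate permutations and sign changes, which is the kind of symmetry the abstract exploits.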