-
Supermodular Approximation of Norms and Applications
Authors:
Thomas Kesselheim,
Marco Molinaro,
Sahil Singla
Abstract:
Many classical problems in theoretical computer science involve norms, even if implicitly; for example, both XOS functions and downward-closed sets are equivalent to some norms. The last decade has seen a lot of interest in designing algorithms beyond the standard $\ell_p$ norms $\|\cdot \|_p$. Despite notable advancements, many existing methods remain tailored to specific problems, leaving the broader applicability to general norms less understood. This paper investigates the intrinsic properties of $\ell_p$ norms that facilitate their widespread use and seeks to abstract these qualities to a more general setting.
We identify supermodularity -- often reserved for combinatorial set functions and characterized by monotone gradients -- as a defining feature beneficial for $ \|\cdot\|_p^p$. We introduce the notion of $p$-supermodularity for norms, asserting that a norm is $p$-supermodular if its $p^{th}$ power function exhibits supermodularity. The association of supermodularity with norms offers a new lens through which to view and construct algorithms.
Our work demonstrates that for a large class of problems $p$-supermodularity is a sufficient criterion for developing good algorithms. This is either by reframing existing algorithms for problems like Online Load-Balancing and Bandits with Knapsacks through a supermodular lens, or by introducing novel analyses for problems such as Online Covering, Online Packing, and Stochastic Probing. Moreover, we prove that every symmetric norm can be approximated by a $p$-supermodular norm. Together, these recover and extend several results from the literature, and support $p$-supermodularity as a unified theoretical framework for optimization challenges centered around norm-related problems.
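As a small numerical illustration of the definition in this abstract, the sketch below checks an increasing-differences inequality (a discrete counterpart of monotone gradients) on nonnegative vectors. The helper names and the test vectors are illustrative assumptions, not taken from the paper. At the chosen point, $\|\cdot\|_2^2$ satisfies the inequality while $\|\cdot\|_\infty^2$ violates it.

```python
def lp_power(p):
    """Return f(x) = ||x||_p^p (hypothetical helper, for illustration only)."""
    return lambda x: sum(abs(xi) ** p for xi in x)


def linf_power(p):
    """Return f(x) = ||x||_inf^p (hypothetical helper, for illustration only)."""
    return lambda x: max(abs(xi) for xi in x) ** p


def add(x, y):
    """Coordinate-wise sum of two vectors given as lists."""
    return [a + b for a, b in zip(x, y)]


def increasing_differences_gap(f, x, u, v):
    """Compute [f(x+u+v) - f(x+v)] - [f(x+u) - f(x)].

    A nonnegative value means that first adding v does not shrink the
    gain obtained from adding u, which is the supermodular behaviour.
    """
    return (f(add(add(x, u), v)) - f(add(x, v))) - (f(add(x, u)) - f(x))


# Assumed test point: all vectors are nonnegative.
x, u, v = [1, 0], [1, 0], [0, 2]
gap_l2 = increasing_differences_gap(lp_power(2), x, u, v)
gap_linf = increasing_differences_gap(linf_power(2), x, u, v)
print(gap_l2, gap_linf)  # prints "0 -3"
```

A single point can only refute supermodularity, not prove it; the check for $\|\cdot\|_2^2$ is therefore a sanity test rather than a proof.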
Submitted 21 June, 2024;
originally announced June 2024.
-
Feasibility of access EGI resources through the ESCAPE developed ESFRI Science Analysis Platform
Authors:
Giuliano Taffoni,
Sara Bertocco,
Dave Morris,
Manu Parra-Royón,
Klaas Kliffen,
Marco Molinaro,
John Swinbank,
Susana Sanchez Exposito
Abstract:
The EU ESCAPE project is developing ESAP, the ESFRI Scientific Analysis Platform, as an API gateway that enables the seamless integration of independent services accessing distributed data and computing resources. In ESCAPE we are exploring the possibility of exploiting EGI's OpenStack cloud computing services through ESAP. In our contribution we briefly describe ESCAPE and ESAP, the use cases, the work done to automate virtual machine creation in EGI's OpenStack cloud, and the drawbacks and possible solutions.
Submitted 24 January, 2023;
originally announced January 2023.
-
Online and Bandit Algorithms Beyond $\ell_p$ Norms
Authors:
Thomas Kesselheim,
Marco Molinaro,
Sahil Singla
Abstract:
Vector norms play a fundamental role in computer science and optimization, so there is an ongoing effort to generalize existing algorithms to settings beyond $\ell_\infty$ and $\ell_p$ norms. We show that many online and bandit applications for general norms admit good algorithms as long as the norm can be approximated by a function that is ``gradient-stable'', a notion that we introduce. Roughly, it says that the gradient of the function should not drastically decrease (multiplicatively) in any component as we increase the input vector. We prove that several families of norms, including all monotone symmetric norms, admit a gradient-stable approximation, giving us the first online and bandit algorithms for these norm families.
In particular, our notion of gradient-stability gives $O\big(\log^2 (\text{dimension})\big)$-competitive algorithms for the symmetric norm generalizations of Online Generalized Load Balancing and Bandits with Knapsacks. Our techniques extend to applications beyond symmetric norms as well, e.g., to Online Vector Scheduling and to Online Generalized Assignment with Convex Costs. Some key properties underlying our applications that are implied by gradient-stable approximations are a ``smooth game inequality'' and an approximate converse to Jensen's inequality.
Submitted 24 October, 2022;
originally announced October 2022.
-
Online Demand Scheduling with Failovers
Authors:
Konstantina Mellou,
Marco Molinaro,
Rudy Zhou
Abstract:
Motivated by cloud computing applications, we study the problem of how to optimally deploy new hardware subject to both power and robustness constraints. To model the situation observed in large-scale data centers, we introduce the Online Demand Scheduling with Failover problem. There are $m$ identical devices with capacity constraints. Demands come one-by-one and, to be robust against a device failure, need to be assigned to a pair of devices. When a device fails (in a failover scenario), each demand assigned to it is rerouted to its paired device (which may now run at increased capacity). The goal is to assign demands to the devices to maximize the total utilization subject to both the normal capacity constraints as well as these novel failover constraints. These latter constraints introduce new decision tradeoffs not present in classic assignment problems such as the Multiple Knapsack problem and AdWords.
In the worst-case model, we design a deterministic $\approx \frac{1}{2}$-competitive algorithm, and show this is essentially tight. To circumvent this constant-factor loss, which in the context of big cloud providers represents substantial capital losses, we consider the stochastic arrival model, where all demands come i.i.d. from an unknown distribution. In this model we design an algorithm that achieves a sub-linear additive regret (i.e. as OPT or $m$ increases, the multiplicative competitive ratio goes to $1$). This requires a combination of different techniques, including a configuration LP with a non-trivial post-processing step and an online monotone matching procedure introduced by Rhee and Talagrand.
Submitted 1 September, 2022;
originally announced September 2022.
-
Time-Constrained Learning
Authors:
Sergio Filho,
Eduardo Laber,
Pedro Lazera,
Marco Molinaro
Abstract:
Consider a scenario in which we have a huge labeled dataset ${\cal D}$ and a limited time to train some given learner using ${\cal D}$. Since we may not be able to use the whole dataset, how should we proceed? Questions of this nature motivate the definition of the Time-Constrained Learning Task (TCL): Given a dataset ${\cal D}$ sampled from an unknown distribution $μ$, a learner ${\cal L}$ and a time limit $T$, the goal is to obtain in at most $T$ units of time the classification model with highest possible accuracy w.r.t. $μ$, among those that can be built by ${\cal L}$ using the dataset ${\cal D}$.
We propose TCT, an algorithm for the TCL task designed based on principles from Machine Teaching. We present an experimental study involving 5 different Learners and 20 datasets where we show that TCT consistently outperforms two other algorithms: the first is a Teacher for black-box learners proposed in [Dasgupta et al., ICML 19] and the second is a natural adaptation of random sampling for the TCL setting. We also compare TCT with Stochastic Gradient Descent training -- our method is again consistently better.
While our work is primarily practical, we also show that a stripped-down version of TCT has provable guarantees. Under reasonable assumptions, the time our algorithm takes to achieve a certain accuracy is never much bigger than the time it takes the batch teacher (which sends a single batch of examples) to achieve similar accuracy, and in some cases it is almost exponentially better.
Submitted 3 February, 2022;
originally announced February 2022.
-
Robust Secretary and Prophet Algorithms for Packing Integer Programs
Authors:
C. J. Argue,
Anupam Gupta,
Marco Molinaro,
Sahil Singla
Abstract:
We study the problem of solving Packing Integer Programs (PIPs) in the online setting, where columns in $[0,1]^d$ of the constraint matrix are revealed sequentially, and the goal is to pick a subset of the columns that sum to at most $B$ in each coordinate while maximizing the objective. Excellent results are known in the secretary setting, where the columns are adversarially chosen, but presented in a uniformly random order. However, these existing algorithms are susceptible to adversarial attacks: they try to "learn" characteristics of a good solution, but tend to over-fit to the model, and hence a small number of adversarial corruptions can cause the algorithm to fail.
In this paper, we give the first robust algorithms for Packing Integer Programs, specifically in the recently proposed Byzantine Secretary framework. Our techniques are based on a two-level use of online learning, to robustly learn an approximation to the optimal value, and then to use this robust estimate to pick a good solution. These techniques are general and we use them to design robust algorithms for PIPs in the prophet model as well, specifically in the Prophet-with-Augmentations framework. We also improve known results in the Byzantine Secretary framework: we make the non-constructive results algorithmic and improve the existing bounds for single-item and matroid constraints.
Submitted 23 December, 2021;
originally announced December 2021.
-
Lipschitz Selectors may not Yield Competitive Algorithms for Convex Body Chasing
Authors:
C. J. Argue,
Anupam Gupta,
Marco Molinaro
Abstract:
The current best algorithms for the convex body chasing problem in online algorithms use the notion of the Steiner point of a convex set. In particular, the algorithm which always moves to the Steiner point of the request set is $O(d)$ competitive for nested convex body chasing, and this is optimal among memoryless algorithms [Bubeck et al. 2020]. A memoryless algorithm coincides with the notion of a selector in functional analysis. The Steiner point is noted for being Lipschitz with respect to the Hausdorff metric, and for achieving the minimal Lipschitz constant possible. It is natural to ask whether every selector with this Lipschitz property yields a competitive algorithm for nested convex body chasing. We answer this question in the negative by exhibiting a selector which yields a non-competitive algorithm for nested convex body chasing but is Lipschitz with respect to Hausdorff distance. Furthermore, we show that being Lipschitz with respect to an $L_p$-type analog to the Hausdorff distance is sufficient to guarantee competitiveness if and only if $p=1$.
Submitted 8 February, 2022; v1 submitted 15 April, 2021;
originally announced April 2021.
-
Lower Bounds on the Size of General Branch-and-Bound Trees
Authors:
Santanu S. Dey,
Yatharth Dubey,
Marco Molinaro
Abstract:
A \emph{general branch-and-bound tree} is a branch-and-bound tree which is allowed to use general disjunctions of the form $π^{\top} x \leq π_0 \,\vee\, π^{\top}x \geq π_0 + 1$, where $π$ is an integer vector and $π_0$ is an integer scalar, to create child nodes. We construct a packing instance, a set covering instance, and a Traveling Salesman Problem instance, such that any general branch-and-bound tree that solves these instances must be of exponential size. We also verify that an exponential lower bound on the size of general branch-and-bound trees persists when we add Gaussian noise to the coefficients of the cross polytope, thus showing that a polynomial-size "smoothed analysis" upper bound is not possible. The results in this paper can be viewed as the branch-and-bound analog of the seminal paper by Chvátal et al. (1989), who proved lower bounds for the Chvátal-Gomory rank.
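To make the general disjunction $π^{\top} x \leq π_0 \,\vee\, π^{\top}x \geq π_0 + 1$ concrete, the following sketch builds one for a fractional point by taking $π_0 = \lfloor π^{\top} x \rfloor$. The function names and the sample vectors are assumptions made for illustration. The sketch shows that the fractional point violates both sides of the disjunction while every integer point on a small grid satisfies one of them.

```python
import math


def dot(pi, x):
    """Inner product pi^T x for vectors given as lists."""
    return sum(p * xi for p, xi in zip(pi, x))


def build_disjunction(pi, x):
    """Return pi_0 = floor(pi^T x), the natural choice for cutting off x."""
    return math.floor(dot(pi, x))


def satisfies_disjunction(pi, pi0, x):
    """True if x lies in one of the two child nodes of the disjunction."""
    value = dot(pi, x)
    return value <= pi0 or value >= pi0 + 1


pi = [1, 1]              # integer direction (assumed example)
x_frac = [0.5, 1.0]      # fractional point with pi^T x = 1.5
pi0 = build_disjunction(pi, x_frac)

# The fractional point belongs to neither child node.
fractional_cut_off = not satisfies_disjunction(pi, pi0, x_frac)

# Every integer point keeps pi^T x integral, so it lands in one child.
integers_kept = all(
    satisfies_disjunction(pi, pi0, [a, b])
    for a in range(-2, 3)
    for b in range(-2, 3)
)
print(pi0, fractional_cut_off, integers_kept)  # prints "1 True True"
```

Variable branching is the special case where $π$ is a unit vector.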
Submitted 19 January, 2022; v1 submitted 17 March, 2021;
originally announced March 2021.
-
Robust Algorithms for Online Convex Problems via Primal-Dual
Authors:
Marco Molinaro
Abstract:
Primal-dual methods in online optimization give several of the state-of-the-art results in both of the most common models: adversarial and stochastic/random order. Here we try to provide a more unified analysis of primal-dual algorithms to better understand the mechanisms behind this important method. With this we are able to recover and extend in one go several results from the literature.
In particular, we obtain robust online algorithms for fairly general online convex problems: we consider the MIXED model where in some of the time steps the data is stochastic and in the others the data is adversarial. Both the quantity and location of the adversarial time steps are unknown to the algorithm. The guarantees of our algorithms interpolate between the (close to) best guarantees for each of the pure models. In particular, the presence of adversarial times does not degrade the guarantee relative to the stochastic part of the instance.
Concretely, we first consider Online Convex Programming: at each time a feasible set $V_t$ is revealed, and the algorithm needs to select $v_t \in V_t$ to minimize the total cost $ψ(\sum_t v_t)$, for a convex function $ψ$. Our robust primal-dual algorithm for this problem on the MIXED model recovers and extends, for example, a result of Gupta et al. and recent work on $\ell_p$-norm load balancing by the author. We also consider the problem of Welfare Maximization with Convex Production Costs: at each time a customer presents a value $c_t$ and resource consumption vector $a_t$, and the goal is to fractionally select customers to maximize the profit $\sum_t c_t x_t - ψ(\sum_t a_t x_t)$. Our robust primal-dual algorithm on the MIXED model recovers and extends the result of Azar et al.
Given the ubiquity of primal-dual algorithms, we hope the ideas presented here will be useful in obtaining other robust algorithms in the MIXED or related models.
Submitted 2 November, 2020;
originally announced November 2020.
-
Branch-and-Bound Solves Random Binary IPs in Polytime
Authors:
Santanu S. Dey,
Yatharth Dubey,
Marco Molinaro
Abstract:
Branch-and-bound is the workhorse of all state-of-the-art mixed integer linear programming (MILP) solvers. These implementations of branch-and-bound typically use variable branching, that is, the child nodes are obtained by fixing some variable to an integer value $v$ in one node and to $v + 1$ in the other node. Even though modern MILP solvers are able to solve very large-scale instances efficiently, relatively little attention has been given to understanding why the underlying branch-and-bound algorithm performs so well. In this paper our goal is to theoretically analyze the performance of the standard variable branching based branch-and-bound algorithm. In order to avoid the exponential worst-case lower bounds, we follow the common idea of considering random instances. More precisely, we consider random integer programs where the entries of the coefficient matrix and the objective function are randomly sampled.
Our main result is that with good probability branch-and-bound with variable branching explores only a polynomial number of nodes to solve these instances, for a fixed number of constraints. To the best of our knowledge this is the first known such result for a standard version of branch-and-bound. We believe that this result provides a compelling indication of why branch-and-bound with variable branching works so well in practice.
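The variable-branching scheme described in this abstract can be sketched on the smallest case, a binary IP with a single packing constraint (a 0/1 knapsack). This is only an illustrative toy under stated assumptions: positive weights, nonnegative values, a depth-first node order, and an LP relaxation solved by the greedy ratio rule. It is not the paper's analysis or a production solver, and all function names are hypothetical.

```python
def lp_relaxation(values, weights, capacity, fixed):
    """Solve the LP relaxation of a 0/1 knapsack with some variables fixed.

    Returns None if the node is infeasible, otherwise (bound, j) where j
    is the index of the single fractional variable, or None if the LP
    optimum is already integral. Assumes all weights are positive.
    """
    cap = capacity - sum(weights[i] for i, v in fixed.items() if v == 1)
    if cap < 0:
        return None
    bound = sum(values[i] for i, v in fixed.items() if v == 1)
    free = [i for i in range(len(values)) if i not in fixed]
    free.sort(key=lambda i: values[i] / weights[i], reverse=True)
    for i in free:
        if weights[i] <= cap:          # item fits entirely
            cap -= weights[i]
            bound += values[i]
        elif cap > 0:                  # item fits only fractionally
            return bound + values[i] * cap / weights[i], i
        else:                          # no capacity left
            break
    return bound, None


def branch_and_bound(values, weights, capacity):
    """Return (optimal value, number of explored nodes)."""
    best, nodes = 0, 0
    stack = [{}]                       # each node is a dict of fixed variables
    while stack:
        fixed = stack.pop()
        nodes += 1
        result = lp_relaxation(values, weights, capacity, fixed)
        if result is None:
            continue                   # prune: infeasible
        bound, j = result
        if bound <= best:
            continue                   # prune: cannot beat the incumbent
        if j is None:
            best = bound               # integral LP optimum: new incumbent
        else:
            stack.append({**fixed, j: 0})   # child with x_j = 0
            stack.append({**fixed, j: 1})   # child with x_j = 1
    return best, nodes


best, nodes = branch_and_bound([60, 100, 120], [10, 20, 30], 50)
print(best, nodes)  # prints "220 5"
```

The node count returned here is the quantity that the abstract bounds polynomially for random instances.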
Submitted 16 September, 2021; v1 submitted 29 July, 2020;
originally announced July 2020.
-
Knapsack Secretary with Bursty Adversary
Authors:
Thomas Kesselheim,
Marco Molinaro
Abstract:
The random-order or secretary model is one of the most popular beyond-worst-case models for online algorithms. While it avoids the pessimism of the traditional adversarial model, in practice we cannot expect the input to be presented in perfectly random order. This has motivated research on ``best of both worlds'' (algorithms with good performance on both purely stochastic and purely adversarial inputs), or even better, on inputs that are a mix of both stochastic and adversarial parts. Unfortunately the latter seems much harder to achieve and very few results of this type are known.
Towards advancing our understanding of designing such robust algorithms, we propose a random-order model with bursts of adversarial time steps. The assumption of burstiness of unexpected patterns is reasonable in many contexts, since changes (e.g. a spike in the demand for a good) are often triggered by a common external event. We then consider the Knapsack Secretary problem in this model: there is a knapsack of size $k$ (e.g., available quantity of a good), and in each of the $n$ time steps an item comes with its value and size in $[0,1]$ and the algorithm needs to make an irrevocable decision whether to accept or reject the item.
We design an algorithm that gives an approximation of $1 - \tilde{O}(Γ/k)$ when the adversarial time steps can be covered by $Γ\ge \sqrt{k}$ intervals of size $\tilde{O}(\frac{n}{k})$. In particular, setting $Γ= \sqrt{k}$ gives a $(1 - O(\frac{\ln^2 k}{\sqrt{k}}))$-approximation that is resistant to up to a $\frac{\ln^2 k}{\sqrt{k}}$-fraction of the items being adversarial, which is almost optimal even in the absence of adversarial items. Also, setting $Γ= \tilde{Ω}(k)$ gives a constant approximation that is resistant to up to a constant fraction of items being adversarial.
Submitted 20 June, 2020;
originally announced June 2020.
-
Curvature of Feasible Sets in Offline and Online Optimization
Authors:
Marco Molinaro
Abstract:
It is known that the curvature of the feasible set in convex optimization allows for algorithms with better convergence rates, and there has been renewed interest in this topic both for offline as well as online problems. In this paper, leveraging results on geometry and convex analysis, we further our understanding of the role of curvature in optimization:
- We first show the equivalence of two notions of curvature, namely strong convexity and gauge bodies, proving a conjecture of Abernethy et al. As a consequence, this shows that the Frank-Wolfe-type method of Wang and Abernethy has accelerated convergence rate $O(\frac{1}{t^2})$ over strongly convex feasible sets without additional assumptions on the (convex) objective function.
- In Online Linear Optimization, we identify two main properties that help explain \emph{why/when} Follow the Leader (FTL) has only logarithmic regret over strongly convex sets. This allows one to directly recover a recent result of Huang et al., and to show that FTL has logarithmic regret over strongly convex sets whenever the gain vectors are non-negative.
- We provide an efficient procedure for approximating convex bodies by strongly convex ones while smoothly trading off approximation error and curvature. This allows one to extend the improved algorithms over strongly convex sets to general convex sets. As a concrete application, we extend the results of Dekel et al. on Online Linear Optimization with Hints to general convex sets.
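For readers unfamiliar with Follow the Leader, the sketch below runs it on the Euclidean unit ball, a standard example of a strongly convex set, with gain vectors (so the leader maximizes the cumulative gain). The function name, the choice to play the origin in the first round, and the sample gains are assumptions made for illustration; the sketch does not reproduce the paper's analysis.

```python
import math


def ftl_unit_ball(gains):
    """Run Follow the Leader on the Euclidean unit ball.

    At each step FTL plays the point of the ball maximizing the inner
    product with the sum of past gain vectors, i.e. that sum normalized.
    Returns (total gain of FTL, regret against the best fixed point).
    """
    dim = len(gains[0])
    cumulative = [0.0] * dim
    total = 0.0
    for g in gains:
        norm = math.sqrt(sum(c * c for c in cumulative))
        # With no past data (or a zero sum) every point is a leader;
        # playing the origin is an arbitrary tie-breaking assumption.
        x = [c / norm for c in cumulative] if norm > 0 else [0.0] * dim
        total += sum(xi * gi for xi, gi in zip(x, g))
        cumulative = [c + gi for c, gi in zip(cumulative, g)]
    # The best fixed point in hindsight earns the norm of the total gain.
    best_fixed = math.sqrt(sum(c * c for c in cumulative))
    return total, best_fixed - total


# Ten identical non-negative gain vectors (assumed toy input).
total, regret = ftl_unit_ball([[1.0, 0.0]] * 10)
print(total, regret)  # prints "9.0 1.0"
```

In this toy run the only loss comes from the uninformed first round, so the regret stays constant as the horizon grows.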
Submitted 10 May, 2021; v1 submitted 8 February, 2020;
originally announced February 2020.
-
Exposing SED Models And Snapshots Via VO Simulation Artefacts
Authors:
Chaitra,
Sara Bertocco,
Marco Molinaro,
Sergio Molinari,
Antonio Ragagnin,
Giuliano Taffoni
Abstract:
The Virtual Observatory (VO) simulation standards, Simulation Data Model (SimDM) and Simulation Data Access Layer (SimDAL), establish a framework for the discoverability and dissemination of data created in simulation projects. These standards address the complexity of having a standard access and facade for data which is expected to be multifaceted and of a diverse range. In this paper, we detail the realisation of an application exposing the theoretical products of one such scientific project via the simulation facades proposed by the VO. The scientific project in question is a study of the evolution of young clusters in dense molecular clumps. The theoretical products arising from this study include a grid of 20 million SED (Spectral Energy Distribution) models for synthetic young clusters and related data products. Details on the implementation of SimDAL components in the application as well as the ways in which the data structures of SimDM are incorporated onto the existing data products are provided.
Submitted 21 November, 2019;
originally announced November 2019.
-
$k$-Servers with a Smile: Online Algorithms via Projections
Authors:
Niv Buchbinder,
Anupam Gupta,
Marco Molinaro,
Joseph Naor
Abstract:
We consider the $k$-server problem on trees and HSTs. We give an algorithm based on Bregman projections. This algorithm has competitive ratios that match some of the recent results given by Bubeck et al. (STOC 2018), whose algorithm was based on mirror-descent-based continuous dynamics prescribed via a differential inclusion.
Submitted 19 December, 2018; v1 submitted 17 October, 2018;
originally announced October 2018.
-
Stochastic $\ell_p$ Load Balancing and Moment Problems via the $L$-Function Method
Authors:
Marco Molinaro
Abstract:
This paper considers stochastic optimization problems whose objective functions involve powers of random variables. For example, consider the classic Stochastic $\ell_p$ Load Balancing Problem (SLBp): There are $m$ machines and $n$ jobs, and known independent random variables $Y_{ij}$ describe the load incurred on machine $i$ if we assign job $j$ to it. The goal is to assign each job to a machine in order to minimize the expected $l_p$-norm of the total load on the machines. While convex relaxations represent one of the most powerful algorithmic tools, in problems such as SLBp the main difficulty is to capture the objective function in a way that only depends on each random variable separately.
We show how to capture $p$-power-type objectives in such a separable way by using the $L$-function method, introduced by Latała to relate the moment of sums of random variables to the individual marginals. We show how this quickly leads to a constant-factor approximation for a very general subset selection problem with $p$-moment objective.
Moreover, we give a constant-factor approximation for SLBp, improving on the recent $O(p/\ln p)$-approximation of [Gupta et al., SODA 18]. Here the application of the method is much more involved. In particular, we need to sharply connect the expected $l_p$-norm of a random vector with the $p$-moments of its marginals (machine loads), taking into account simultaneously the different scales of the loads that are incurred by an unknown assignment.
Submitted 11 October, 2018;
originally announced October 2018.
-
Italian center for Astronomical Archives publishing solution: modular and distributed
Authors:
Marco Molinaro,
Nicola F. Calabria,
Robert Butora,
Sonia Zorba,
Riccardo Smareglia
Abstract:
The Italian center for Astronomical Archives aims to provide astronomical data resources as interoperable services based on IVOA standards. Its VO expertise and knowledge come from active participation within IVOA and VO at the European and international levels, with a twofold goal: to learn from the collaboration and to provide input to the community. The first solution, intended to build an easy-to-configure and easy-to-maintain resource publisher conformant to VO standards, proved to be too optimistic. For this reason it has been necessary to rethink the architecture as a modular system built around the messaging concept, where each modular component speaks to the other interested parties through a system of broker-managed queues. The first implemented protocol, the Simple Cone Search, shows the messaging task architecture connecting the parametric HTTP interface to the database backend access module and the logging module, and allows multiple cone search resources to be managed together through a configuration manager module. Although relatively young, it has already proved the flexibility required by the overall system when the database backend changed from MySQL to PostgreSQL+PgSphere. Another implementation test has been made to leverage task distribution over multiple servers to serve simultaneously: FITS cubes direct linking, cubes cutout and cubes positional merging. Currently the implementation of the SIA-2.0 standard protocol is ongoing, while for TAP we will be adapting the TAPlib library. Alongside these tools, a first administration tool (TASMAN) has been developed to ease the build-up and maintenance of TAP_SCHEMA-ta, including also ObsCore maintenance capability. Future work will be devoted to widening the range of VO protocols covered by the set of available modules, improving the configuration management, and developing specific-purpose modules common to all the service components.
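The "parametric HTTP interface" of a Simple Cone Search service can be illustrated from the client side. In the IVOA Simple Cone Search standard a query is an HTTP GET with the parameters RA, DEC and SR, all in decimal degrees. The sketch below only builds such a request URL; the base URL is a hypothetical placeholder, the function name is an assumption, and nothing here reflects the internals of the system described in the abstract.

```python
from urllib.parse import urlencode


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build a Simple Cone Search query URL.

    RA, DEC and SR are the parameter names defined by the IVOA standard
    (right ascension, declination and search radius, in decimal degrees).
    """
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    return f"{base_url}?{query}"


# Hypothetical service endpoint, used only for illustration.
url = cone_search_url("https://example.org/scs", 180.0, -30.5, 0.1)
print(url)  # prints "https://example.org/scs?RA=180.0&DEC=-30.5&SR=0.1"
```

A compliant service answers such a request with a VOTable listing the sources inside the cone.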
Submitted 21 May, 2018;
originally announced May 2018.
-
Maximizing Profit with Convex Costs in the Random-order Model
Authors:
Anupam Gupta,
Ruta Mehta,
Marco Molinaro
Abstract:
Suppose a set of requests arrives online: each request gives some value $v_i$ if accepted, but requires using some amount of each of $d$ resources. Our cost is a convex function of the vector of total utilization of these $d$ resources. Which requests should we accept to maximize our profit, i.e., the sum of values of the accepted demands, minus the convex cost?
We consider this problem in the random-order a.k.a. secretary model, and show an $O(d)$-competitive algorithm for the case where the convex cost function is also supermodular. If the set of accepted demands must also be independent in a given matroid, we give an $O(d^3 α)$-competitive algorithm for the supermodular case, and an improved $O(d^2α)$ if the convex cost function is also separable. Here $α$ is the competitive ratio of the best algorithm for the submodular secretary problem. These extend and improve previous results known for this problem. Our techniques are simple but use powerful ideas from convex duality, which give clean interpretations of existing work, and allow us to give the extensions and improvements.
Submitted 22 April, 2018;
originally announced April 2018.
-
Online and Random-order Load Balancing Simultaneously
Authors:
Marco Molinaro
Abstract:
We consider the problem of online load balancing under $\ell_p$-norms: sequential jobs need to be assigned to one of the machines and the goal is to minimize the $\ell_p$-norm of the machine loads. This generalizes the classical problem of scheduling for makespan minimization (case $\ell_\infty$) and has been thoroughly studied. However, despite the recent push for beyond worst-case analyses, no such results are known for this problem. In this paper we provide algorithms with simultaneous guarantees for the worst-case model as well as for the random-order (i.e. secretary) model, where an arbitrary set of jobs comes in random order. First, we show that the greedy algorithm (with restart), known to have optimal $O(p)$ worst-case guarantee, also has a (typically) improved random-order guarantee. However, the behavior of this algorithm in the random-order model degrades with $p$. We then propose algorithm SIMULTANEOUSLB that has simultaneously optimal guarantees (within constants) in both worst-case and random-order models. In particular, the random-order guarantee of SIMULTANEOUSLB improves as $p$ increases.
One of the main components is a new algorithm with improved regret for Online Linear Optimization (OLO) over the non-negative vectors in the $\ell_q$ ball. Interestingly, this OLO algorithm is also used to prove a purely probabilistic inequality that controls the correlations arising in the random-order model, a common source of difficulty for the analysis. Another important component used in both SIMULTANEOUSLB and our OLO algorithm is a smoothing of the $\ell_p$-norm that may be of independent interest. This smoothness property allows us to see algorithm SIMULTANEOUSLB as essentially a greedy one in the worst-case model and as a primal-dual one in the random-order model, which is instrumental for its simultaneous guarantees.
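For readers unfamiliar with the setting, the following is a minimal sketch of the plain greedy rule for load balancing under the p-norm: each arriving job goes to the machine where it increases the sum of p-th powers of the loads the least. It omits the restart mechanism mentioned in the abstract and is not the paper's SIMULTANEOUSLB algorithm; the function names and the input format (one list of per-machine loads for each job) are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def lp_norm(loads: Sequence[float], p: float) -> float:
    """Return the p-norm of a vector of non-negative machine loads."""
    return sum(x ** p for x in loads) ** (1.0 / p)


def greedy_assign(
    jobs: Sequence[Sequence[float]], num_machines: int, p: float
) -> Tuple[List[int], List[float]]:
    """Assign jobs one at a time (hypothetical helper, plain greedy, no restart).

    jobs[j][i] is the load that job j adds if it is placed on machine i.
    Each job is placed on the machine that minimizes the increase in the
    sum of p-th powers of the loads; ties go to the lowest machine index.
    """
    loads = [0.0] * num_machines
    assignment: List[int] = []
    for job in jobs:
        best = min(
            range(num_machines),
            key=lambda i: (loads[i] + job[i]) ** p - loads[i] ** p,
        )
        loads[best] += job[best]
        assignment.append(best)
    return assignment, loads


if __name__ == "__main__":
    assignment, loads = greedy_assign([[1, 1], [1, 1], [2, 2]], num_machines=2, p=2)
    print(assignment, loads, lp_norm(loads, 2))
```

Minimizing the increase in the sum of p-th powers picks the same machine as minimizing the resulting p-norm, since the p-th root is monotone; the power form just avoids recomputing the norm for every candidate machine.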
Submitted 28 October, 2016; v1 submitted 24 October, 2016;
originally announced October 2016.
-
How the Experts Algorithm Can Help Solve LPs Online
Authors:
Anupam Gupta,
Marco Molinaro
Abstract:
We consider the problem of solving packing/covering LPs online, when the columns of the constraint matrix are presented in random order. This problem has received much attention and the main focus is to figure out how large the right-hand sides of the LPs have to be (compared to the entries on the left-hand side of the constraints) to allow $(1+ε)$-approximations online. It is known that the right-hand sides have to be $Ω(ε^{-2} \log m)$ times the left-hand sides, where $m$ is the number of constraints.
In this paper we give a primal-dual algorithm that achieves this bound for mixed packing/covering LPs. Our algorithms construct dual solutions using a regret-minimizing online learning algorithm in a black-box fashion, and use them to construct primal solutions. The adversarial guarantee that holds for the constructed duals helps us to take care of most of the correlations that arise in the algorithm; the remaining correlations are handled via martingale concentration and maximal inequalities. These ideas lead to conceptually simple and modular algorithms, which we hope will be useful in other contexts.
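As background on the kind of black box involved, here is a minimal sketch of the multiplicative-weights (Hedge) rule, one standard regret-minimizing algorithm for the experts setting. It is a generic learner and not the paper's primal-dual algorithm; the function name, the learning rate `eta`, and the assumption that losses lie in [0, 1] are illustrative choices.

```python
import math
from typing import List, Sequence, Tuple


def hedge(loss_rounds: Sequence[Sequence[float]], eta: float) -> Tuple[List[float], float]:
    """Run Hedge over a sequence of loss vectors (hypothetical helper).

    loss_rounds[t][i] is the loss of expert i in round t, assumed in [0, 1].
    Returns the final distribution over experts and the learner's total
    expected loss, where each round is played before its losses are seen.
    """
    num_experts = len(loss_rounds[0])
    weights = [1.0] * num_experts
    total_loss = 0.0
    for losses in loss_rounds:
        norm = sum(weights)
        probs = [w / norm for w in weights]
        # Expected loss of playing the current distribution this round.
        total_loss += sum(q * l for q, l in zip(probs, losses))
        # Exponentially down-weight experts in proportion to their loss.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    norm = sum(weights)
    return [w / norm for w in weights], total_loss


if __name__ == "__main__":
    rounds = [[1.0, 0.0] for _ in range(10)]
    probs, total = hedge(rounds, eta=0.5)
    print(probs, total)
```

With losses in [0, 1], the learner's total expected loss exceeds that of the best expert by at most ln(m)/eta + eta*T/8 for m experts and T rounds, which is the type of adversarial guarantee that a regret-minimizing learner provides.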
Submitted 25 November, 2015; v1 submitted 20 July, 2014;
originally announced July 2014.
-
Mixed-integer Quadratic Programming is in NP
Authors:
Alberto Del Pia,
Santanu S. Dey,
Marco Molinaro
Abstract:
Mixed-integer quadratic programming is the problem of optimizing a quadratic function over points in a polyhedral set where some of the components are restricted to be integral. In this paper, we prove that the decision version of mixed-integer quadratic programming is in NP, thereby showing that it is NP-complete. This is established by showing that if the decision version of mixed-integer quadratic programming is feasible, then there exists a solution of polynomial size. This result generalizes and unifies classical results that quadratic programming is in NP and integer linear programming is in NP.
Submitted 17 July, 2014;
originally announced July 2014.
-
Geometry of Online Packing Linear Programs
Authors:
Marco Molinaro,
R. Ravi
Abstract:
We consider packing LPs with $m$ rows where all constraint coefficients are normalized to be in the unit interval. The $n$ columns arrive in random order and the goal is to set the corresponding decision variables irrevocably when they arrive so as to obtain a feasible solution maximizing the expected reward. Previous $(1 - ε)$-competitive algorithms require the right-hand side of the LP to be $Ω((m/ε^2) \log (n/ε))$, a bound that worsens with the number of columns and rows. However, the dependence on the number of columns is not required in the single-row case and known lower bounds for the general case are also independent of $n$.
Our goal is to understand whether the dependence on $n$ is required in the multi-row case, making it fundamentally harder than the single-row version. We refute this by exhibiting an algorithm which is $(1 - ε)$-competitive as long as the right-hand sides are $Ω((m^2/ε^2) \log (m/ε))$. Our techniques refine previous PAC-learning based approaches which interpret the online decisions as linear classifications of the columns based on sampled dual prices. The key ingredient of our improvement comes from a non-standard covering argument together with the realization that only when the columns of the LP belong to few 1-d subspaces can we obtain such small covers; bounding the size of the cover constructed also relies on the geometry of linear classifiers. General packing LPs are handled by perturbing the input columns, which can be seen as making the learning problem more robust.
Submitted 25 April, 2012;
originally announced April 2012.
-
The Query-commit Problem
Authors:
Marco Molinaro,
R. Ravi
Abstract:
In the query-commit problem we are given a graph where edges have distinct probabilities of existing. It is possible to query the edges of the graph, and if the queried edge exists then its endpoints are irrevocably matched. The goal is to find a querying strategy which maximizes the expected size of the matching obtained. This stochastic matching setup is motivated by applications in kidney exchanges and online dating.
In this paper we address the query-commit problem from both theoretical and experimental perspectives. First, we show that a simple class of edges can be queried without compromising the optimality of the strategy. This property is then used to obtain in polynomial time an optimal querying strategy when the input graph is sparse. Next we turn our attention to the kidney exchange application, focusing on instances modeled over real data from existing exchange programs. We prove that, as the number of nodes grows, almost every instance admits a strategy which matches almost all nodes. This result supports the intuition that more exchanges are possible on a larger pool of patients/donors and gives theoretical justification for unifying the existing exchange programs. Finally, we experimentally evaluate different querying strategies over kidney exchange instances. We show that even very simple heuristics perform fairly well, being within 1.5% of an optimal clairvoyant strategy that knows in advance the edges in the graph. In such a time-sensitive application, this result motivates the use of committing strategies.
Submitted 5 October, 2011;
originally announced October 2011.
-
Approximation Algorithms for Correlated Knapsacks and Non-Martingale Bandits
Authors:
Anupam Gupta,
Ravishankar Krishnaswamy,
Marco Molinaro,
R. Ravi
Abstract:
In the stochastic knapsack problem, we are given a knapsack of size B, and a set of jobs whose sizes and rewards are drawn from a known probability distribution. However, we know the actual size and reward only when the job completes. How should we schedule jobs to maximize the expected total reward? We know O(1)-approximations when we assume that (i) rewards and sizes are independent random variables, and (ii) we cannot prematurely cancel jobs. What can we say when either or both of these assumptions are changed?
The stochastic knapsack problem is of interest in its own right, but techniques developed for it are applicable to other stochastic packing problems. Indeed, ideas for this problem have been useful for budgeted learning problems, where one is given several arms which evolve in a specified stochastic fashion with each pull, and the goal is to pull the arms a total of B times to maximize the reward obtained. Much recent work on this problem focuses on the case when the evolution of the arms follows a martingale, i.e., when the expected reward from the future is the same as the reward at the current state. What can we say when the rewards do not form a martingale?
In this paper, we give constant-factor approximation algorithms for the stochastic knapsack problem with correlations and/or cancellations, and also for budgeted learning problems where the martingale condition is not satisfied. Indeed, we can show that previously proposed LP relaxations have large integrality gaps. We propose new time-indexed LP relaxations, and convert the fractional solutions into distributions over strategies, and then use the LP values and the time ordering information from these strategies to devise a randomized adaptive scheduling algorithm. We hope our LP formulation and decomposition methods may provide a new way to address other correlated bandit problems with more general contexts.
Submitted 17 February, 2011;
originally announced February 2011.
-
Capacitated Vehicle Routing with Non-Uniform Speeds
Authors:
Inge Li Gortz,
Marco Molinaro,
Viswanath Nagarajan,
R. Ravi
Abstract:
The capacitated vehicle routing problem (CVRP) involves distributing (identical) items from a depot to a set of demand locations, using a single capacitated vehicle. We study a generalization of this problem to the setting of multiple vehicles having non-uniform speeds (that we call Heterogenous CVRP), and present a constant-factor approximation algorithm.
The technical heart of our result lies in achieving a constant approximation to the following TSP variant (called Heterogenous TSP). Given a metric denoting distances between vertices, a depot r containing k vehicles with possibly different speeds, the goal is to find a tour for each vehicle (starting and ending at r), so that every vertex is covered in some tour and the maximum completion time is minimized. This problem is precisely Heterogenous CVRP when vehicles are uncapacitated.
The presence of non-uniform speeds introduces difficulties for employing standard tour-splitting techniques. In order to get a better understanding of this technique in our context, we appeal to ideas from the 2-approximation for scheduling on parallel machines of Lenstra et al. This motivates the introduction of a new approximate MST construction called Level-Prim, which is related to Light Approximate Shortest-path Trees. The last component of our algorithm involves partitioning the Level-Prim tree and matching the resulting parts to vehicles. This decomposition is more subtle than usual since now we need to enforce correlation between the size of the parts and their distances to the depot.
Submitted 8 December, 2010;
originally announced December 2010.
-
On the Complexity of Searching in Trees: Average-case Minimization
Authors:
Ferdinando Cicalese,
Tobias Jacobs,
Eduardo Laber,
Marco Molinaro
Abstract:
We focus on the average-case analysis: A function w : V -> Z+ is given which defines the likelihood for a node to be the one marked, and we want the strategy that minimizes the expected number of queries. Prior to this paper, very little was known about this natural question and the complexity of the problem had remained so far an open question.
We close this question and prove that the above tree search problem is NP-complete even for the class of trees with diameter at most 4. This results in a complete characterization of the complexity of the problem with respect to the diameter size. In fact, for diameter not larger than 3 the problem can be shown to be polynomially solvable using a dynamic programming approach.
In addition we prove that the problem is NP-complete even for the class of trees of maximum degree at most 16. To the best of our knowledge, the only known result in this direction is that the tree search problem is solvable in O(|V| log|V|) time for trees with degree at most 2 (paths).
We match the above complexity results with a tight algorithmic analysis. We first show that a natural greedy algorithm attains a 2-approximation. Furthermore, for the bounded degree instances, we show that any optimal strategy (i.e., one that minimizes the expected number of queries) performs at most O(Δ(T) (log |V| + log w(T))) queries in the worst case, where w(T) is the sum of the likelihoods of the nodes of T and Δ(T) is the maximum degree of T. We combine this result with a non-trivial exponential time algorithm to provide an FPTAS for trees with bounded degree.
Submitted 9 August, 2009; v1 submitted 22 April, 2009;
originally announced April 2009.