-
A Geometric Unification of Distributionally Robust Covariance Estimators: Shrinking the Spectrum by Inflating the Ambiguity Set
Authors:
Man-Chung Yue,
Yves Rychener,
Daniel Kuhn,
Viet Anh Nguyen
Abstract:
The state-of-the-art methods for estimating high-dimensional covariance matrices all shrink the eigenvalues of the sample covariance matrix towards a data-insensitive shrinkage target. The underlying shrinkage transformation is either chosen heuristically - without compelling theoretical justification - or optimally in view of restrictive distributional assumptions. In this paper, we propose a principled approach to construct covariance estimators without imposing restrictive assumptions. That is, we study distributionally robust covariance estimation problems that minimize the worst-case Frobenius error with respect to all data distributions close to a nominal distribution, where the proximity of distributions is measured via a divergence on the space of covariance matrices. We identify mild conditions on this divergence under which the resulting minimizers represent shrinkage estimators. We show that the corresponding shrinkage transformations are intimately related to the geometrical properties of the underlying divergence. We also prove that our robust estimators are efficiently computable and asymptotically consistent and that they enjoy finite-sample performance guarantees. We exemplify our general methodology by synthesizing explicit estimators induced by the Kullback-Leibler, Fisher-Rao, and Wasserstein divergences. Numerical experiments based on synthetic and real data show that our robust estimators are competitive with state-of-the-art estimators.
Submitted 30 May, 2024;
originally announced May 2024.
-
A Large Deviations Perspective on Policy Gradient Algorithms
Authors:
Wouter Jongeneel,
Daniel Kuhn,
Mengmeng Li
Abstract:
Motivated by policy gradient methods in the context of reinforcement learning, we identify a large deviation rate function for the iterates generated by stochastic gradient descent for possibly non-convex objectives satisfying a Polyak-Łojasiewicz condition. Leveraging the contraction principle from large deviations theory, we illustrate the potential of this result by showing how convergence properties of policy gradient with a softmax parametrization and an entropy regularized objective can be naturally extended to a wide spectrum of other policy parametrizations.
Submitted 3 June, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Contextual Stochastic Bilevel Optimization
Authors:
Yifan Hu,
Jie Wang,
Yao Xie,
Andreas Krause,
Daniel Kuhn
Abstract:
We introduce contextual stochastic bilevel optimization (CSBO) -- a stochastic bilevel optimization framework with the lower-level problem minimizing an expectation conditioned on some contextual information and the upper-level decision variable. This framework extends classical stochastic bilevel optimization when the lower-level decision maker responds optimally not only to the decision of the upper-level decision maker but also to some side information and when there are multiple or even infinitely many followers. It captures important applications such as meta-learning, personalized federated learning, end-to-end learning, and Wasserstein distributionally robust optimization with side information (WDRO-SI). Due to the presence of contextual information, existing single-loop methods for classical stochastic bilevel optimization are unable to converge. To overcome this challenge, we introduce an efficient double-loop gradient method based on the Multilevel Monte-Carlo (MLMC) technique and establish its sample and computational complexities. When specialized to stochastic nonconvex optimization, our method matches existing lower bounds. For meta-learning, the complexity of our method does not depend on the number of tasks. Numerical experiments further validate our theoretical results.
Submitted 27 October, 2023;
originally announced October 2023.
-
Unifying Distributionally Robust Optimization via Optimal Transport Theory
Authors:
Jose Blanchet,
Daniel Kuhn,
Jiajin Li,
Bahar Taskesen
Abstract:
In the past few years, there has been considerable interest in two prominent approaches for Distributionally Robust Optimization (DRO): Divergence-based and Wasserstein-based methods. The divergence approach models misspecification in terms of likelihood ratios, while the latter models it through a measure of distance or cost in actual outcomes. Building upon these advances, this paper introduces a novel approach that unifies these methods into a single framework based on optimal transport (OT) with conditional moment constraints. Our proposed approach, for example, makes it possible for optimal adversarial distributions to simultaneously perturb likelihood and outcomes, while producing an optimal (in an optimal transport sense) coupling between the baseline model and the adversarial model. Additionally, the paper investigates several duality results and presents tractable reformulations that enhance the practical applicability of this unified framework.
Submitted 10 August, 2023;
originally announced August 2023.
-
End-to-End Learning for Stochastic Optimization: A Bayesian Perspective
Authors:
Yves Rychener,
Daniel Kuhn,
Tobias Sutter
Abstract:
We develop a principled approach to end-to-end learning in stochastic optimization. First, we show that the standard end-to-end learning algorithm admits a Bayesian interpretation and trains a posterior Bayes action map. Building on the insights of this analysis, we then propose new end-to-end learning algorithms for training decision maps that output solutions of empirical risk minimization and distributionally robust optimization problems, two dominant modeling paradigms in optimization under uncertainty. Numerical results for a synthetic newsvendor problem illustrate the key differences between alternative training schemes. We also investigate an economic dispatch problem based on real data to showcase the impact of the neural network architecture of the decision maps on their test performance.
Submitted 11 June, 2023; v1 submitted 7 June, 2023;
originally announced June 2023.
-
Frequency Regulation with Storage: On Losses and Profits
Authors:
Dirk Lauinger,
François Vuille,
Daniel Kuhn
Abstract:
Low-carbon societies will need to store vast amounts of electricity to balance intermittent generation from wind and solar energy, for example, through frequency regulation. Here, we derive an analytical solution to the decision-making problem of storage operators who sell frequency regulation power to grid operators and trade electricity on day-ahead markets. Mathematically, we treat future frequency deviation trajectories as functional uncertainties in a receding horizon robust optimization problem. We constrain the expected terminal state-of-charge to be equal to some target to allow storage operators to make good decisions not only for the present but also the future. Thanks to this constraint, the amount of electricity traded on day-ahead markets is an implicit function of the regulation power sold to grid operators. The implicit function quantifies the amount of power that needs to be purchased to cover the expected energy loss that results from providing frequency regulation. We show how the marginal cost associated with the expected energy loss decreases with roundtrip efficiency and increases with frequency deviation dispersion. We find that the profits from frequency regulation over the lifetime of energy-constrained storage devices are roughly inversely proportional to the length of time for which regulation power must be committed.
Submitted 26 March, 2024; v1 submitted 5 June, 2023;
originally announced June 2023.
-
Policy Gradient Algorithms for Robust MDPs with Non-Rectangular Uncertainty Sets
Authors:
Mengmeng Li,
Daniel Kuhn,
Tobias Sutter
Abstract:
We propose policy gradient algorithms for robust infinite-horizon Markov decision processes (MDPs) with non-rectangular uncertainty sets, thereby addressing an open challenge in the robust MDP literature. Indeed, uncertainty sets that display statistical optimality properties and make optimal use of limited data often fail to be rectangular. Unfortunately, the corresponding robust MDPs cannot be solved with dynamic programming techniques and are in fact provably intractable. We first present a randomized projected Langevin dynamics algorithm that solves the robust policy evaluation problem to global optimality but is inefficient. We also propose a deterministic policy gradient method that is efficient but solves the robust policy evaluation problem only approximately, and we prove that the approximation error scales with a new measure of non-rectangularity of the uncertainty set. Finally, we describe an actor-critic algorithm that finds an $ε$-optimal solution for the robust policy improvement problem in $\mathcal{O}(1/ε^4)$ iterations. We thus present the first complete solution scheme for robust MDPs with non-rectangular uncertainty sets offering global optimality guarantees. Numerical experiments show that our algorithms compare favorably against state-of-the-art methods.
Submitted 23 January, 2024; v1 submitted 30 May, 2023;
originally announced May 2023.
-
Distributionally Robust Linear Quadratic Control
Authors:
Bahar Taşkesen,
Dan A. Iancu,
Çağıl Koçyiğit,
Daniel Kuhn
Abstract:
Linear-Quadratic-Gaussian (LQG) control is a fundamental control paradigm that is studied in various fields such as engineering, computer science, economics, and neuroscience. It involves controlling a system with linear dynamics and imperfect observations, subject to additive noise, with the goal of minimizing a quadratic cost function for the state and control variables. In this work, we consider a generalization of the discrete-time, finite-horizon LQG problem, where the noise distributions are unknown and belong to Wasserstein ambiguity sets centered at nominal (Gaussian) distributions. The objective is to minimize a worst-case cost across all distributions in the ambiguity set, including non-Gaussian distributions. Despite the added complexity, we prove that a control policy that is linear in the observations is optimal for this problem, as in the classic LQG problem. We propose a numerical solution method that efficiently characterizes this optimal control policy. Our method uses the Frank-Wolfe algorithm to identify the least-favorable distributions within the Wasserstein ambiguity sets and computes the controller's optimal policy using Kalman filter estimation under these distributions.
Submitted 1 November, 2023; v1 submitted 26 May, 2023;
originally announced May 2023.
-
PIQP: A Proximal Interior-Point Quadratic Programming Solver
Authors:
Roland Schwan,
Yuning Jiang,
Daniel Kuhn,
Colin N. Jones
Abstract:
This paper presents PIQP, a high-performance toolkit for solving generic sparse quadratic programs (QP). Combining an infeasible Interior Point Method (IPM) with the Proximal Method of Multipliers (PMM), the algorithm can handle ill-conditioned convex QP problems without the need for linear independence of the constraints. The open-source implementation is written in C++ with interfaces to C, Python, Matlab, and R leveraging the Eigen3 library. The method uses a pivoting-free factorization routine and allocation-free updates of the problem data, making the solver suitable for embedded applications. The solver is evaluated on the Maros-Mészáros problem set and optimal control problems, demonstrating state-of-the-art performance for both small and large-scale problems, outperforming commercial and open-source solvers.
Submitted 15 September, 2023; v1 submitted 1 April, 2023;
originally announced April 2023.
-
New Perspectives on Regularization and Computation in Optimal Transport-Based Distributionally Robust Optimization
Authors:
Soroosh Shafieezadeh-Abadeh,
Liviu Aolaritei,
Florian Dörfler,
Daniel Kuhn
Abstract:
We study optimal transport-based distributionally robust optimization problems where a fictitious adversary, often envisioned as nature, can choose the distribution of the uncertain problem parameters by reshaping a prescribed reference distribution at a finite transportation cost. In this framework, we show that robustification is intimately related to various forms of variation and Lipschitz regularization even if the transportation cost function fails to be (some power of) a metric. We also derive conditions for the existence and the computability of a Nash equilibrium between the decision-maker and nature, and we demonstrate numerically that nature's Nash strategy can be viewed as a distribution that is supported on remarkably deceptive adversarial samples. Finally, we identify practically relevant classes of optimal transport-based distributionally robust optimization problems that can be addressed with efficient gradient descent algorithms even if the loss function or the transportation cost function are nonconvex (but not both at the same time).
Submitted 7 March, 2023;
originally announced March 2023.
-
Perfect matchings in random sparsifications of Dirac hypergraphs
Authors:
Dong Yeap Kang,
Tom Kelly,
Daniela Kühn,
Deryk Osthus,
Vincent Pfenninger
Abstract:
For all integers $n \geq k > d \geq 1$, let $m_{d}(k,n)$ be the minimum integer $D \geq 0$ such that every $k$-uniform $n$-vertex hypergraph $\mathcal H$ with minimum $d$-degree $δ_{d}(\mathcal H)$ at least $D$ has an optimal matching. For every fixed integer $k \geq 3$, we show that for $n \in k \mathbb{N}$ and $p = Ω(n^{-k+1} \log n)$, if $\mathcal H$ is an $n$-vertex $k$-uniform hypergraph with $δ_{k-1}(\mathcal H) \geq m_{k-1}(k,n)$, then a.a.s. its $p$-random subhypergraph $\mathcal H_p$ contains a perfect matching. Moreover, for every fixed integer $d < k$ and $γ > 0$, we show that the same conclusion holds if $\mathcal H$ is an $n$-vertex $k$-uniform hypergraph with $δ_d(\mathcal H) \geq m_{d}(k,n) + γ\binom{n - d}{k - d}$. Both of these results strengthen Johansson, Kahn, and Vu's seminal solution to Shamir's problem and can be viewed as "robust" versions of hypergraph Dirac-type results. In addition, we also show that in both cases above, $\mathcal H$ has at least $\exp((1-1/k)n \log n - Θ(n))$ many perfect matchings, which is best possible up to an $\exp(Θ(n))$ factor.
Submitted 16 April, 2024; v1 submitted 2 November, 2022;
originally announced November 2022.
-
Thresholds for Latin squares and Steiner triple systems: Bounds within a logarithmic factor
Authors:
Dong Yeap Kang,
Tom Kelly,
Daniela Kühn,
Abhishek Methuku,
Deryk Osthus
Abstract:
We prove that for $n \in \mathbb N$ and an absolute constant $C$, if $p \geq C\log^2 n / n$ and $L_{i,j} \subseteq [n]$ is a random subset of $[n]$ where each $k\in [n]$ is included in $L_{i,j}$ independently with probability $p$ for each $i, j\in [n]$, then asymptotically almost surely there is an order-$n$ Latin square in which the entry in the $i$th row and $j$th column lies in $L_{i,j}$. The problem of determining the threshold probability for the existence of an order-$n$ Latin square was raised independently by Johansson, by Luria and Simkin, and by Casselgren and Häggkvist; our result provides an upper bound which is tight up to a factor of $\log n$ and strengthens the bound recently obtained by Sah, Sawhney, and Simkin. We also prove analogous results for Steiner triple systems and $1$-factorizations of complete graphs, and moreover, we show that each of these thresholds is at most the threshold for the existence of a $1$-factorization of a nearly complete regular bipartite graph.
Submitted 26 March, 2023; v1 submitted 29 June, 2022;
originally announced June 2022.
-
Stability Verification of Neural Network Controllers using Mixed-Integer Programming
Authors:
Roland Schwan,
Colin N. Jones,
Daniel Kuhn
Abstract:
We propose a framework for the stability verification of Mixed-Integer Linear Programming (MILP) representable control policies. This framework compares a fixed candidate policy, which admits an efficient parameterization and can be evaluated at a low computational cost, against a fixed baseline policy, which is known to be stable but expensive to evaluate. We provide sufficient conditions for the closed-loop stability of the candidate policy in terms of the worst-case approximation error with respect to the baseline policy, and we show that these conditions can be checked by solving a Mixed-Integer Quadratic Program (MIQP). Additionally, we demonstrate that an outer and inner approximation of the stability region of the candidate policy can be computed by solving an MILP. The proposed framework is sufficiently general to accommodate a broad range of candidate policies including ReLU Neural Networks (NNs), optimal solution maps of parametric quadratic programs, and Model Predictive Control (MPC) policies. We also present an open-source toolbox in Python based on the proposed framework, which allows for the easy verification of custom NN architectures and MPC formulations. We showcase the flexibility and reliability of our framework in the context of a DC-DC power converter case study and investigate its computational complexity.
Submitted 31 May, 2023; v1 submitted 27 June, 2022;
originally announced June 2022.
-
On Approximations of Data-Driven Chance Constrained Programs over Wasserstein Balls
Authors:
Zhi Chen,
Daniel Kuhn,
Wolfram Wiesemann
Abstract:
Distributionally robust chance constrained programs minimize a deterministic cost function subject to the satisfaction of one or more safety conditions with high probability, given that the probability distribution of the uncertain problem parameters affecting the safety condition(s) is only known to belong to some ambiguity set. We study three popular approximation schemes for distributionally robust chance constrained programs over Wasserstein balls, where the ambiguity set contains all probability distributions within a certain Wasserstein distance to a reference distribution. The first approximation replaces the chance constraint with a bound on the conditional value-at-risk, the second approximation decouples different safety conditions via Bonferroni's inequality, and the third approximation restricts the expected violation of the safety condition(s) so that the chance constraint is satisfied. We show that the conditional value-at-risk approximation can be characterized as a tight convex approximation, which complements earlier findings on classical (non-robust) chance constraints, and we offer a novel interpretation in terms of transportation savings. We also show that the three approximations can perform arbitrarily poorly in data-driven settings, and that they are generally incomparable with each other.
Submitted 20 November, 2022; v1 submitted 1 June, 2022;
originally announced June 2022.
-
Metrizing Fairness
Authors:
Yves Rychener,
Bahar Taskesen,
Daniel Kuhn
Abstract:
We study supervised learning problems that have significant effects on individuals from two demographic groups, and we seek predictors that are fair with respect to a group fairness criterion such as statistical parity (SP). A predictor is SP-fair if the distributions of predictions within the two groups are close in Kolmogorov distance, and fairness is achieved by penalizing the dissimilarity of these two distributions in the objective function of the learning problem. In this paper, we identify conditions under which hard SP constraints are guaranteed to improve predictive accuracy. We also showcase conceptual and computational benefits of measuring unfairness with integral probability metrics (IPMs) other than the Kolmogorov distance. Conceptually, we show that the generator of any IPM can be interpreted as a family of utility functions and that unfairness with respect to this IPM arises if individuals in the two demographic groups have diverging expected utilities. We also prove that the unfairness-regularized prediction loss admits unbiased gradient estimators, which are constructed from random mini-batches of training samples, if unfairness is measured by the squared $\mathcal L^2$-distance or by a squared maximum mean discrepancy. In this case, the fair learning problem is susceptible to efficient stochastic gradient descent (SGD) algorithms. Numerical experiments on synthetic and real data show that these SGD algorithms outperform state-of-the-art methods for fair learning in that they achieve superior accuracy-unfairness trade-offs -- sometimes orders of magnitude faster.
Submitted 11 June, 2024; v1 submitted 30 May, 2022;
originally announced May 2022.
-
Discrete Optimal Transport with Independent Marginals is #P-Hard
Authors:
Bahar Taşkesen,
Soroosh Shafieezadeh-Abadeh,
Daniel Kuhn,
Karthik Natarajan
Abstract:
We study the computational complexity of the optimal transport problem that evaluates the Wasserstein distance between the distributions of two K-dimensional discrete random vectors. The best known algorithms for this problem run in polynomial time in the maximum of the number of atoms of the two distributions. However, if the components of either random vector are independent, then this number can be exponential in K even though the size of the problem description scales linearly with K. We prove that the described optimal transport problem is #P-hard even if all components of the first random vector are independent uniform Bernoulli random variables, while the second random vector has merely two atoms, and even if only approximate solutions are sought. We also develop a dynamic programming-type algorithm that approximates the Wasserstein distance in pseudo-polynomial time when the components of the first random vector follow arbitrary independent discrete distributions, and we identify special problem instances that can be solved exactly in strongly polynomial time.
Submitted 14 October, 2022; v1 submitted 2 March, 2022;
originally announced March 2022.
-
Mean-Covariance Robust Risk Measurement
Authors:
Viet Anh Nguyen,
Soroosh Shafiee,
Damir Filipović,
Daniel Kuhn
Abstract:
We introduce a universal framework for mean-covariance robust risk measurement and portfolio optimization. We model uncertainty in terms of the Gelbrich distance on the mean-covariance space, along with prior structural information about the population distribution. Our approach is related to the theory of optimal transport and exhibits statistical and computational properties superior to those of existing models. We find that, for a large class of risk measures, mean-covariance robust portfolio optimization boils down to the Markowitz model, subject to a regularization term given in closed form. This includes the finance standards, value-at-risk and conditional value-at-risk, and can be solved highly efficiently.
Submitted 30 November, 2023; v1 submitted 18 December, 2021;
originally announced December 2021.
-
Solution to a problem of Erdős on the chromatic index of hypergraphs with bounded codegree
Authors:
Dong Yeap Kang,
Tom Kelly,
Daniela Kühn,
Abhishek Methuku,
Deryk Osthus
Abstract:
In 1977, Erdős asked the following question: for any integers $t,n \in \mathbb{N}$, if $G_1 , \dots , G_n$ are complete graphs such that each $G_i$ has at most $n$ vertices and every pair of them shares at most $t$ vertices, what is the largest possible chromatic number of the union $\bigcup_{i=1}^{n} G_i$? The equivalent dual formulation of this question asks for the largest chromatic index of an $n$-vertex hypergraph with maximum degree at most $n$ and codegree at most $t$. For the case $t = 1$, Erdős, Faber, and Lovász famously conjectured that the answer is $n$, which was recently proved by the authors for all sufficiently large $n$. In this paper, we answer this question of Erdős for $t \geq 2$ in a strong sense, by proving that every $n$-vertex hypergraph with maximum degree at most $(1-o(1))tn$ and codegree at most $t$ has chromatic index at most $tn$ for any $t,n \in \mathbb{N}$. Moreover, equality holds if and only if the hypergraph is a $t$-fold projective plane of order $k$, where $n = k^2 + k + 1$. Thus, for every $t \in \mathbb N$, this bound is best possible for infinitely many integers $n$. This result also holds for the list chromatic index.
Submitted 12 October, 2021;
originally announced October 2021.
-
Hypergraph regularity and random sampling
Authors:
Felix Joos,
Jaehoon Kim,
Daniela Kühn,
Deryk Osthus
Abstract:
Suppose that a $k$-uniform hypergraph $H$ satisfies a certain regularity instance (that is, there is a partition of $H$ given by the hypergraph regularity lemma into a bounded number of quasirandom subhypergraphs of prescribed densities). We prove that with high probability a large enough uniform random sample of the vertex set of $H$ also admits the same regularity instance. Here the crucial feature is that the error term measuring the quasirandomness of the subhypergraphs requires only an arbitrarily small additive correction. This has applications to combinatorial property testing. The graph case of the sampling result was proved by Alon, Fischer, Newman and Shapira.
Submitted 11 August, 2022; v1 submitted 4 October, 2021;
originally announced October 2021.
-
A special case of Vu's conjecture: Coloring nearly disjoint graphs of bounded maximum degree
Authors:
Tom Kelly,
Daniela Kühn,
Deryk Osthus
Abstract:
A collection of graphs is \textit{nearly disjoint} if every pair of them intersects in at most one vertex. We prove that if $G_1, \dots, G_m$ are nearly disjoint graphs of maximum degree at most $D$, then the following holds. For every fixed $C$, if each vertex $v \in \bigcup_{i=1}^m V(G_i)$ is contained in at most $C$ of the graphs $G_1, \dots, G_m$, then the (list) chromatic number of $\bigcup_{i=1}^m G_i$ is at most $D + o(D)$. This result confirms a special case of a conjecture of Vu and generalizes Kahn's bound on the list chromatic index of linear uniform hypergraphs of bounded maximum degree. In fact, this result holds for the correspondence (or DP) chromatic number and thus implies a recent result of Molloy, and we derive this result from a more general list coloring result in the setting of `color degrees' that also implies a result of Reed and Sudakov.
Submitted 28 October, 2023; v1 submitted 23 September, 2021;
originally announced September 2021.
-
Graph and hypergraph colouring via nibble methods: A survey
Authors:
Dong Yeap Kang,
Tom Kelly,
Daniela Kühn,
Abhishek Methuku,
Deryk Osthus
Abstract:
This paper provides a survey of methods, results, and open problems on graph and hypergraph colourings, with a particular emphasis on semi-random `nibble' methods. We also give a detailed sketch of some aspects of the recent proof of the Erdős-Faber-Lovász conjecture.
Submitted 16 November, 2021; v1 submitted 25 June, 2021;
originally announced June 2021.
-
Distributionally Robust Optimization with Markovian Data
Authors:
Mengmeng Li,
Tobias Sutter,
Daniel Kuhn
Abstract:
We study a stochastic program where the probability distribution of the uncertain problem parameters is unknown and only indirectly observed via finitely many correlated samples generated by an unknown Markov chain with $d$ states. We propose a data-driven distributionally robust optimization model to estimate the problem's objective function and optimal solution. By leveraging results from large deviations theory, we derive statistical guarantees on the quality of these estimators. The underlying worst-case expectation problem is nonconvex and involves $\mathcal O(d^2)$ decision variables. Thus, it cannot be solved efficiently for large $d$. By exploiting the structure of this problem, we devise a customized Frank-Wolfe algorithm with convex direction-finding subproblems of size $\mathcal O(d)$. We prove that this algorithm finds a stationary point efficiently under mild conditions. The efficiency of the method is predicated on a dimensionality reduction enabled by a dual reformulation. Numerical experiments indicate that our approach has better computational and statistical properties than the state-of-the-art methods.
Submitted 12 June, 2021;
originally announced June 2021.
-
Robust Generalization despite Distribution Shift via Minimum Discriminating Information
Authors:
Tobias Sutter,
Andreas Krause,
Daniel Kuhn
Abstract:
Training models that perform well under distribution shifts is a central challenge in machine learning. In this paper, we introduce a modeling framework where, in addition to training data, we have partial structural knowledge of the shifted test distribution. We employ the principle of minimum discriminating information to embed the available prior knowledge, and use distributionally robust optimization to account for uncertainty due to the limited samples. By leveraging large deviation results, we obtain explicit generalization bounds with respect to the unknown shifted distribution. Lastly, we demonstrate the versatility of our framework on two rather distinct applications: (1) training classifiers on systematically biased data and (2) off-policy evaluation in Markov Decision Processes.
Submitted 26 October, 2021; v1 submitted 8 June, 2021;
originally announced June 2021.
-
Sequential Domain Adaptation by Synthesizing Distributionally Robust Experts
Authors:
Bahar Taskesen,
Man-Chung Yue,
Jose Blanchet,
Daniel Kuhn,
Viet Anh Nguyen
Abstract:
Least squares estimators, when trained on a few target domain samples, may predict poorly. Supervised domain adaptation aims to improve the predictive accuracy by exploiting additional labeled training samples from a source distribution that is close to the target distribution. Given available data, we investigate novel strategies to synthesize a family of least squares estimator experts that are robust with regard to moment conditions. When these moment conditions are specified using Kullback-Leibler or Wasserstein-type divergences, we can find the robust estimators efficiently using convex optimization. We use the Bernstein online aggregation algorithm on the proposed family of robust experts to generate predictions for the sequential stream of target test samples. Numerical experiments on real data show that the robust strategies may outperform non-robust interpolations of the empirical least squares estimators.
Submitted 1 June, 2021;
originally announced June 2021.
-
A Unified Theory of Robust and Distributionally Robust Optimization via the Primal-Worst-Equals-Dual-Best Principle
Authors:
Jianzhe Zhen,
Daniel Kuhn,
Wolfram Wiesemann
Abstract:
Robust and distributionally robust optimization are modeling paradigms for decision-making under uncertainty where the uncertain parameters are only known to reside in an uncertainty set or are governed by any probability distribution from within an ambiguity set, respectively, and a decision is sought that minimizes a cost function under the most adverse outcome of the uncertainty. In this paper, we develop a rigorous and general theory of robust and distributionally robust nonlinear optimization using the language of convex analysis. Our framework is based on a generalized `primal-worst-equals-dual-best' principle that establishes strong duality between a semi-infinite primal worst and a non-convex dual best formulation, both of which admit finite convex reformulations. This principle offers an alternative formulation for robust optimization problems that obviates the need to mobilize the machinery of abstract semi-infinite duality theory to prove strong duality in distributionally robust optimization. We illustrate the modeling power of our approach through convex reformulations for distributionally robust optimization problems whose ambiguity sets are defined through general optimal transport distances, which generalize earlier results for Wasserstein ambiguity sets.
Submitted 19 July, 2023; v1 submitted 3 May, 2021;
originally announced May 2021.
-
Semi-Discrete Optimal Transport: Hardness, Regularization and Numerical Solution
Authors:
Bahar Taskesen,
Soroosh Shafieezadeh-Abadeh,
Daniel Kuhn
Abstract:
Semi-discrete optimal transport problems, which evaluate the Wasserstein distance between a discrete and a generic (possibly non-discrete) probability measure, are believed to be computationally hard. Even though such problems are ubiquitous in statistics, machine learning and computer vision, this perception has not yet received a theoretical justification. To fill this gap, we prove that computing the Wasserstein distance between a discrete probability measure supported on two points and the Lebesgue measure on the standard hypercube is already #P-hard. This insight prompts us to seek approximate solutions for semi-discrete optimal transport problems. We thus perturb the underlying transportation cost with an additive disturbance governed by an ambiguous probability distribution, and we introduce a distributionally robust dual optimal transport problem whose objective function is smoothed with the most adverse disturbance distributions from within a given ambiguity set. We further show that smoothing the dual objective function is equivalent to regularizing the primal objective function, and we identify several ambiguity sets that give rise to several known and new regularization schemes. As a byproduct, we discover an intimate relation between semi-discrete optimal transport problems and discrete choice models traditionally studied in psychology and economics. To solve the regularized optimal transport problems efficiently, we use a stochastic gradient descent algorithm with imprecise stochastic gradient oracles. A new convergence analysis reveals that this algorithm improves the best known convergence guarantee for semi-discrete optimal transport problems with entropic regularizers.
Submitted 29 April, 2022; v1 submitted 10 March, 2021;
originally announced March 2021.
-
Small errors in random zeroth-order optimization are imaginary
Authors:
Wouter Jongeneel,
Man-Chung Yue,
Daniel Kuhn
Abstract:
Most zeroth-order optimization algorithms mimic a first-order algorithm but replace the gradient of the objective function with some gradient estimator that can be computed from a small number of function evaluations. This estimator is constructed randomly, and its expectation matches the gradient of a smooth approximation of the objective function whose quality improves as the underlying smoothing parameter $δ$ is reduced. Gradient estimators requiring a smaller number of function evaluations are preferable from a computational point of view. While estimators based on a single function evaluation can be obtained by use of the divergence theorem from vector calculus, their variance explodes as $δ$ tends to $0$. Estimators based on multiple function evaluations, on the other hand, suffer from numerical cancellation when $δ$ tends to $0$. To combat both effects simultaneously, we extend the objective function to the complex domain and construct a gradient estimator that evaluates the objective at a complex point whose coordinates have small imaginary parts of the order $δ$. As this estimator requires only one function evaluation, it is immune to cancellation. In addition, its variance remains bounded as $δ$ tends to $0$. We prove that zeroth-order algorithms that use our estimator offer the same theoretical convergence guarantees as the state-of-the-art methods. Numerical experiments suggest, however, that they often converge faster in practice.
Submitted 19 March, 2024; v1 submitted 9 March, 2021;
originally announced March 2021.
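The cancellation issue mentioned in the abstract above can be seen in a minimal sketch. It uses the classical deterministic complex-step derivative in one dimension, not the randomized estimator proposed in the paper. The test function `f`, the evaluation point `x0`, and the step sizes are assumptions chosen for illustration only.

```python
import cmath
import math


def complex_step_derivative(f, x, delta=1e-20):
    """Approximate f'(x) by Im(f(x + i*delta)) / delta; no subtraction is involved."""
    return f(complex(x, delta)).imag / delta


def forward_difference(f, x, delta):
    """Approximate f'(x) by (f(x + delta) - f(x)) / delta; suffers from cancellation."""
    return (f(x + delta) - f(x)) / delta


def f(z):
    # Hypothetical analytic test function; cmath accepts real and complex input.
    return cmath.exp(z) * cmath.sin(z)


x0 = 1.0
# Exact derivative of exp(x) * sin(x) is exp(x) * (sin(x) + cos(x)).
exact = math.exp(x0) * (math.sin(x0) + math.cos(x0))

# Error of the complex-step estimate with a very small imaginary perturbation.
cs_error = abs(complex_step_derivative(f, x0) - exact)
# Error of the forward difference with a real step close to machine precision.
fd_error = abs(forward_difference(f, x0, 1e-15).real - exact)

print(f"complex-step error:       {cs_error:.3e}")
print(f"forward-difference error: {fd_error:.3e}")
```

In double precision the forward difference loses most of its significant digits at this step size, whereas the complex-step value stays accurate because it needs a single function evaluation and no difference of nearly equal numbers.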
-
Topological Linear System Identification via Moderate Deviations Theory
Authors:
Wouter Jongeneel,
Tobias Sutter,
Daniel Kuhn
Abstract:
Two dynamical systems are topologically equivalent when their phase-portraits can be morphed into each other by a homeomorphic coordinate transformation on the state space. The induced equivalence classes capture qualitative properties such as stability or the oscillatory nature of the state trajectories, for example. In this paper we develop a method to learn the topological class of an unknown stable system from a single trajectory of finitely many state observations. Using a moderate deviations principle for the least squares estimator of the unknown system matrix $θ$, we prove that the probability of misclassification decays exponentially with the number of observations at a rate that is proportional to the square of the smallest singular value of $θ$.
Submitted 9 April, 2021; v1 submitted 5 March, 2021;
originally announced March 2021.
-
A Planner-Trader Decomposition for Multi-Market Hydro Scheduling
Authors:
Kilian Schindler,
Napat Rujeerapaiboon,
Daniel Kuhn,
Wolfram Wiesemann
Abstract:
Peak/off-peak spreads on European electricity forward and spot markets are eroding due to the ongoing nuclear phaseout in Germany and the steady growth in photovoltaic capacity. The reduced profitability of peak/off-peak arbitrage forces hydropower producers to recover part of their original profitability on the reserve markets. We propose a bi-layer stochastic programming framework for the optimal operation of a fleet of interconnected hydropower plants that sells energy on both the spot and the reserve markets. The outer layer (the planner's problem) optimizes end-of-day reservoir filling levels over one year, whereas the inner layer (the trader's problem) selects optimal hourly market bids within each day. Using an information restriction whereby the planner prescribes the end-of-day reservoir targets one day in advance, we prove that the trader's problem simplifies from an infinite-dimensional stochastic program with 25 stages to a finite two-stage stochastic program with only two scenarios. Substituting this reformulation back into the outer layer and approximating the reservoir targets by affine decision rules allows us to simplify the planner's problem from an infinite-dimensional stochastic program with 365 stages to a two-stage stochastic program that can conveniently be solved via the sample average approximation. Numerical experiments based on a cascade in the Salzburg region of Austria demonstrate the effectiveness of the suggested framework.
Submitted 2 September, 2022; v1 submitted 3 March, 2021;
originally announced March 2021.
-
Efficient Learning of a Linear Dynamical System with Stability Guarantees
Authors:
Wouter Jongeneel,
Tobias Sutter,
Daniel Kuhn
Abstract:
We propose a principled method for projecting an arbitrary square matrix to the non-convex set of asymptotically stable matrices. Leveraging ideas from large deviations theory, we show that this projection is optimal in an information-theoretic sense and that it simply amounts to shifting the initial matrix by an optimal linear quadratic feedback gain, which can be computed exactly and highly efficiently by solving a standard linear quadratic regulator problem. The proposed approach allows us to learn the system matrix of a stable linear dynamical system from a single trajectory of correlated state observations. The resulting estimator is guaranteed to be stable and offers explicit statistical bounds on the estimation error.
Submitted 13 June, 2022; v1 submitted 6 February, 2021;
originally announced February 2021.
-
A proof of the Erdős-Faber-Lovász conjecture
Authors:
Dong Yeap Kang,
Tom Kelly,
Daniela Kühn,
Abhishek Methuku,
Deryk Osthus
Abstract:
The Erdős-Faber-Lovász conjecture (posed in 1972) states that the chromatic index of any linear hypergraph on $n$ vertices is at most $n$. In this paper, we prove this conjecture for every large $n$. We also provide stability versions of this result, which confirm a prediction of Kahn.
Submitted 25 January, 2023; v1 submitted 12 January, 2021;
originally announced January 2021.
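As a small illustration of the bound in the conjecture above, the following sketch checks that the Fano plane (the projective plane of order 2, on $n = 7$ vertices) is a linear hypergraph whose seven edges pairwise intersect, so any proper edge colouring needs exactly $n$ colours. The example and the helper names are chosen here for illustration and are not taken from the paper.

```python
from itertools import combinations

# Lines of the Fano plane (projective plane of order 2) on vertices 0..6.
FANO_LINES = [
    {0, 1, 2}, {0, 3, 4}, {0, 5, 6},
    {1, 3, 5}, {1, 4, 6}, {2, 3, 6}, {2, 4, 5},
]


def is_linear(edges):
    """A hypergraph is linear if any two distinct edges share at most one vertex."""
    return all(len(a & b) <= 1 for a, b in combinations(edges, 2))


def pairwise_intersecting(edges):
    """True if every two distinct edges have at least one vertex in common."""
    return all(a & b for a, b in combinations(edges, 2))


n_vertices = len(set().union(*FANO_LINES))

# In a proper edge colouring, intersecting edges must receive different colours,
# so pairwise intersecting edges force one colour per edge.
colours_needed = len(FANO_LINES) if pairwise_intersecting(FANO_LINES) else None

print(n_vertices, is_linear(FANO_LINES), colours_needed)
```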
-
Path decompositions of tournaments
Authors:
António Girão,
Bertille Granet,
Daniela Kühn,
Allan Lo,
Deryk Osthus
Abstract:
In 1976, Alspach, Mason, and Pullman conjectured that any tournament $T$ of even order can be decomposed into exactly ${\rm ex}(T)$ paths, where ${\rm ex}(T):= \frac{1}{2}\sum_{v\in V(T)}|d_T^+(v)-d_T^-(v)|$. We prove this conjecture for all sufficiently large tournaments. We also prove an asymptotically optimal result for tournaments of odd order.
Submitted 28 July, 2022; v1 submitted 27 October, 2020;
originally announced October 2020.
-
A Pareto Dominance Principle for Data-Driven Optimization
Authors:
Tobias Sutter,
Bart P. G. Van Parys,
Daniel Kuhn
Abstract:
We propose a statistically optimal approach to construct data-driven decisions for stochastic optimization problems. Fundamentally, a data-driven decision is simply a function that maps the available training data to a feasible action. It can always be expressed as the minimizer of a surrogate optimization model constructed from the data. The quality of a data-driven decision is measured by its out-of-sample risk. An additional quality measure is its out-of-sample disappointment, which we define as the probability that the out-of-sample risk exceeds the optimal value of the surrogate optimization model. An ideal data-driven decision should minimize the out-of-sample risk simultaneously with respect to every conceivable probability measure as the true measure is unknown. Unfortunately, such ideal data-driven decisions are generally unavailable. This prompts us to seek data-driven decisions that minimize the in-sample risk subject to an upper bound on the out-of-sample disappointment. We prove that such Pareto-dominant data-driven decisions exist under conditions that allow for interesting applications: the unknown data-generating probability measure must belong to a parametric ambiguity set, and the corresponding parameters must admit a sufficient statistic that satisfies a large deviation principle. We can further prove that the surrogate optimization model must be a distributionally robust optimization problem constructed from the sufficient statistic and the rate function of its large deviation principle. Hence the optimal method for mapping data to decisions is to solve a distributionally robust optimization model. Maybe surprisingly, this result holds even when the training data is non-i.i.d. Our analysis reveals how the structural properties of the data-generating stochastic process impact the shape of the ambiguity set underlying the optimal distributionally robust model.
Submitted 14 December, 2023; v1 submitted 13 October, 2020;
originally announced October 2020.
-
New bounds on the size of Nearly Perfect Matchings in almost regular hypergraphs
Authors:
Dong Yeap Kang,
Daniela Kühn,
Abhishek Methuku,
Deryk Osthus
Abstract:
Let $H$ be a $k$-uniform $D$-regular simple hypergraph on $N$ vertices. Based on an analysis of the Rödl nibble, Alon, Kim and Spencer (1997) proved that if $k \ge 3$, then $H$ contains a matching covering all but at most $ND^{-1/(k-1)+o(1)}$ vertices, and asked whether this bound is tight. In this paper we improve their bound by showing that for all $k > 3$, $H$ contains a matching covering all but at most $ND^{-1/(k-1)-η}$ vertices for some $η= Θ(k^{-3}) > 0$, when $N$ and $D$ are sufficiently large. Our approach consists of showing that the Rödl nibble process not only constructs a large matching but it also produces many well-distributed `augmenting stars' which can then be used to significantly improve the matching constructed by the Rödl nibble process.
Based on this, we also improve the results of Kostochka and Rödl (1998) and Vu (2000) on the size of matchings in almost regular hypergraphs with small codegree. As a consequence, we improve the best known bounds on the size of large matchings in combinatorial designs with general parameters. Finally, we improve the bounds of Molloy and Reed (2000) on the chromatic index of hypergraphs with small codegree (which can be applied to improve the best known bounds on the chromatic index of Steiner triple systems and more general designs).
Submitted 8 October, 2020;
originally announced October 2020.
-
Extremal aspects of graph and hypergraph decomposition problems
Authors:
Stefan Glock,
Daniela Kühn,
Deryk Osthus
Abstract:
We survey recent advances in the theory of graph and hypergraph decompositions, with a focus on extremal results involving minimum degree conditions. We also collect a number of intriguing open problems, and formulate new ones.
Submitted 25 June, 2021; v1 submitted 3 August, 2020;
originally announced August 2020.
-
Hamiltonicity of random subgraphs of the hypercube
Authors:
Padraig Condon,
Alberto Espuny Díaz,
António Girão,
Daniela Kühn,
Deryk Osthus
Abstract:
We study Hamiltonicity in random subgraphs of the hypercube $\mathcal{Q}^n$. Our first main theorem is an optimal hitting time result. Consider the random process which includes the edges of $\mathcal{Q}^n$ according to a uniformly chosen random ordering. Then, with high probability, as soon as the graph produced by this process has minimum degree $2k$, it contains $k$ edge-disjoint Hamilton cycles, for any fixed $k\in\mathbb{N}$. Secondly, we obtain a perturbation result: if $H\subseteq\mathcal{Q}^n$ satisfies $δ(H)\geqαn$ with $α>0$ fixed and we consider a random binomial subgraph $\mathcal{Q}^n_p$ of $\mathcal{Q}^n$ with $p\in(0,1]$ fixed, then with high probability $H\cup\mathcal{Q}^n_p$ contains $k$ edge-disjoint Hamilton cycles, for any fixed $k\in\mathbb{N}$. In particular, both results resolve a long standing conjecture, posed e.g. by Bollobás, that the threshold probability for Hamiltonicity in the random binomial subgraph of the hypercube equals $1/2$. Our techniques also show that, with high probability, for all fixed $p\in(0,1]$ the graph $\mathcal{Q}^n_p$ contains an almost spanning cycle. Our methods involve branching processes, the Rödl nibble, and absorption.
Submitted 13 August, 2022; v1 submitted 6 July, 2020;
originally announced July 2020.
-
Almost all optimally coloured complete graphs contain a rainbow Hamilton path
Authors:
Stephen Gould,
Tom Kelly,
Daniela Kühn,
Deryk Osthus
Abstract:
A subgraph $H$ of an edge-coloured graph is called rainbow if all of the edges of $H$ have different colours. In 1989, Andersen conjectured that every proper edge-colouring of $K_{n}$ admits a rainbow path of length $n-2$. We show that almost all optimal edge-colourings of $K_{n}$ admit both (i) a rainbow Hamilton path and (ii) a rainbow cycle using all of the colours. This result demonstrates that Andersen's Conjecture holds for almost all optimal edge-colourings of $K_{n}$ and answers a recent question of Ferber, Jain, and Sudakov. Our result also has applications to the existence of transversals in random symmetric Latin squares.
Submitted 21 April, 2022; v1 submitted 1 July, 2020;
originally announced July 2020.
-
Reliable Frequency Regulation through Vehicle-to-Grid: Encoding Legislation with Robust Constraints
Authors:
Dirk Lauinger,
François Vuille,
Daniel Kuhn
Abstract:
Problem definition: Vehicle-to-grid increases the low utilization rate of privately owned electric vehicles by making their batteries available to electricity grids. We formulate a robust optimization problem that maximizes a vehicle owner's expected profit from selling primary frequency regulation to the grid and guarantees that market commitments are met at all times for all frequency deviation trajectories in a functional uncertainty set that encodes applicable legislation. Faithfully modeling the energy conversion losses during battery charging and discharging renders this optimization problem non-convex. Methodology/results: By exploiting a total unimodularity property of the uncertainty set and an exact linear decision rule reformulation, we prove that this non-convex robust optimization problem with functional uncertainties is equivalent to a tractable linear program. Through extensive numerical experiments using real-world data, we quantify the economic value of vehicle-to-grid and elucidate the financial incentives of vehicle owners, aggregators, equipment manufacturers, and regulators. Managerial implications: We find that the prevailing penalties for non-delivery of promised regulation power are too low to incentivize vehicle owners to honor the delivery guarantees given to grid operators.
Submitted 9 February, 2024; v1 submitted 12 May, 2020;
originally announced May 2020.
-
On Linear Optimization over Wasserstein Balls
Authors:
Man-Chung Yue,
Daniel Kuhn,
Wolfram Wiesemann
Abstract:
Wasserstein balls, which contain all probability measures within a pre-specified Wasserstein distance to a reference measure, have recently enjoyed wide popularity in the distributionally robust optimization and machine learning communities to formulate and solve data-driven optimization problems with rigorous statistical guarantees. In this technical note we prove that the Wasserstein ball is weakly compact under mild conditions, and we offer necessary and sufficient conditions for the existence of optimal solutions. We also characterize the sparsity of solutions if the Wasserstein ball is centred at a discrete reference measure. In comparison with the existing literature, which has proved similar results under different conditions, our proofs are self-contained and shorter, yet mathematically rigorous, and our necessary and sufficient conditions for the existence of optimal solutions are easily verifiable in practice.
Submitted 6 June, 2021; v1 submitted 15 April, 2020;
originally announced April 2020.
-
Counting Hamilton cycles in Dirac hypergraphs
Authors:
Stefan Glock,
Stephen Gould,
Felix Joos,
Daniela Kühn,
Deryk Osthus
Abstract:
A tight Hamilton cycle in a $k$-uniform hypergraph ($k$-graph) $G$ is a cyclic ordering of the vertices of $G$ such that every set of $k$ consecutive vertices in the ordering forms an edge. Rödl, Ruciński, and Szemerédi proved that for $k\geq 3$, every $k$-graph on $n$ vertices with minimum codegree at least $n/2+o(n)$ contains a tight Hamilton cycle. We show that the number of tight Hamilton cycles in such $k$-graphs is $\exp(n\ln n-\Theta(n))$. As a corollary, we obtain a similar estimate on the number of Hamilton $\ell$-cycles in such $k$-graphs for all $\ell\in\{0,\dots,k-1\}$, which makes progress on a question of Ferber, Krivelevich and Sudakov.
Submitted 10 November, 2020; v1 submitted 20 November, 2019;
originally announced November 2019.
-
Path and cycle decompositions of dense graphs
Authors:
António Girão,
Bertille Granet,
Daniela Kühn,
Deryk Osthus
Abstract:
We make progress on three long-standing conjectures from the 1960s about path and cycle decompositions of graphs. Gallai conjectured that any connected graph on $n$ vertices can be decomposed into at most $\left\lceil \frac{n}{2}\right\rceil$ paths, while a conjecture of Hajós states that any Eulerian graph on $n$ vertices can be decomposed into at most $\left\lfloor \frac{n-1}{2}\right\rfloor$ cycles. The Erdős-Gallai conjecture states that any graph on $n$ vertices can be decomposed into $O(n)$ cycles and edges.
We show that if $G$ is a sufficiently large graph on $n$ vertices with linear minimum degree, then the following hold.
(i) $G$ can be decomposed into at most $\frac{n}{2}+o(n)$ paths.
(ii) If $G$ is Eulerian, then it can be decomposed into at most $\frac{n}{2}+o(n)$ cycles.
(iii) $G$ can be decomposed into at most $\frac{3 n}{2}+o(n)$ cycles and edges.
If in addition $G$ satisfies a weak expansion property, we asymptotically determine the required number of paths/cycles for each such $G$.
(iv) $G$ can be decomposed into $\max \left\{\frac{odd(G)}{2},\frac{\Delta(G)}{2}\right\}+o(n)$ paths, where $odd(G)$ is the number of odd-degree vertices of $G$.
(v) If $G$ is Eulerian, then it can be decomposed into $\frac{\Delta(G)}{2}+o(n)$ cycles.
All bounds in (i)-(v) are asymptotically best possible.
Submitted 15 March, 2021; v1 submitted 13 November, 2019;
originally announced November 2019.
-
Bridging Bayesian and Minimax Mean Square Error Estimation via Wasserstein Distributionally Robust Optimization
Authors:
Viet Anh Nguyen,
Soroosh Shafieezadeh-Abadeh,
Daniel Kuhn,
Peyman Mohajerin Esfahani
Abstract:
We introduce a distributionally robust minimum mean square error estimation model with a Wasserstein ambiguity set to recover an unknown signal from a noisy observation. The proposed model can be viewed as a zero-sum game between a statistician choosing an estimator -- that is, a measurable function of the observation -- and a fictitious adversary choosing a prior -- that is, a pair of signal and noise distributions ranging over independent Wasserstein balls -- with the goal to minimize and maximize the expected squared estimation error, respectively. We show that if the Wasserstein balls are centered at normal distributions, then the zero-sum game admits a Nash equilibrium, where the players' optimal strategies are given by an {\em affine} estimator and a {\em normal} prior, respectively. We further prove that this Nash equilibrium can be computed by solving a tractable convex program. Finally, we develop a Frank-Wolfe algorithm that can solve this convex program orders of magnitude faster than state-of-the-art general purpose solvers. We show that this algorithm enjoys a linear convergence rate and that its direction-finding subproblems can be solved in quasi-closed form.
Submitted 27 January, 2021; v1 submitted 8 November, 2019;
originally announced November 2019.
-
Optimistic Distributionally Robust Optimization for Nonparametric Likelihood Approximation
Authors:
Viet Anh Nguyen,
Soroosh Shafieezadeh-Abadeh,
Man-Chung Yue,
Daniel Kuhn,
Wolfram Wiesemann
Abstract:
The likelihood function is a fundamental component in Bayesian statistics. However, evaluating the likelihood of an observation is computationally intractable in many applications. In this paper, we propose a non-parametric approximation of the likelihood that identifies a probability measure which lies in the neighborhood of the nominal measure and that maximizes the probability of observing the given sample point. We show that when the neighborhood is constructed by the Kullback-Leibler divergence, by moment conditions or by the Wasserstein distance, then our \textit{optimistic likelihood} can be determined through the solution of a convex optimization problem, and it admits an analytical expression in particular cases. We also show that the posterior inference problem with our optimistic likelihood approximation enjoys strong theoretical performance guarantees, and it performs competitively in a probabilistic classification task.
Submitted 23 October, 2019;
originally announced October 2019.
-
Calculating Optimistic Likelihoods Using (Geodesically) Convex Optimization
Authors:
Viet Anh Nguyen,
Soroosh Shafieezadeh-Abadeh,
Man-Chung Yue,
Daniel Kuhn,
Wolfram Wiesemann
Abstract:
A fundamental problem arising in many areas of machine learning is the evaluation of the likelihood of a given observation under different nominal distributions. Frequently, these nominal distributions are themselves estimated from data, which makes them susceptible to estimation errors. We thus propose to replace each nominal distribution with an ambiguity set containing all distributions in its vicinity and to evaluate an \emph{optimistic likelihood}, that is, the maximum of the likelihood over all distributions in the ambiguity set. When the proximity of distributions is quantified by the Fisher-Rao distance or the Kullback-Leibler divergence, the emerging optimistic likelihoods can be computed efficiently using either geodesic or standard convex optimization techniques. We showcase the advantages of working with optimistic likelihoods on a classification problem using synthetic as well as empirical data.
Submitted 17 October, 2019;
originally announced October 2019.
-
Wasserstein Distributionally Robust Optimization: Theory and Applications in Machine Learning
Authors:
Daniel Kuhn,
Peyman Mohajerin Esfahani,
Viet Anh Nguyen,
Soroosh Shafieezadeh-Abadeh
Abstract:
Many decision problems in science, engineering and economics are affected by uncertain parameters whose distribution is only indirectly observable through samples. The goal of data-driven decision-making is to learn a decision from finitely many training samples that will perform well on unseen test samples. This learning task is difficult even if all training and test samples are drawn from the same distribution---especially if the dimension of the uncertainty is large relative to the training sample size. Wasserstein distributionally robust optimization seeks data-driven decisions that perform well under the most adverse distribution within a certain Wasserstein distance from a nominal distribution constructed from the training samples. In this tutorial we will argue that this approach has many conceptual and computational benefits. Most prominently, the optimal decisions can often be computed by solving tractable convex optimization problems, and they enjoy rigorous out-of-sample and asymptotic consistency guarantees. We will also show that Wasserstein distributionally robust optimization has interesting ramifications for statistical learning and motivates new approaches for fundamental learning tasks such as classification, regression, maximum likelihood estimation or minimum mean square error estimation, among others.
Submitted 23 August, 2019;
originally announced August 2019.
-
Dirac's theorem for random regular graphs
Authors:
Padraig Condon,
Alberto Espuny Díaz,
António Girão,
Daniela Kühn,
Deryk Osthus
Abstract:
We prove a `resilience' version of Dirac's theorem in the setting of random regular graphs. More precisely, we show that, whenever $d$ is sufficiently large compared to $\varepsilon>0$, a.a.s. the following holds: let $G'$ be any subgraph of the random $n$-vertex $d$-regular graph $G_{n,d}$ with minimum degree at least $(1/2+\varepsilon)d$. Then $G'$ is Hamiltonian.
This proves a conjecture of Ben-Shimon, Krivelevich and Sudakov. Our result is best possible: firstly, the condition that $d$ is large cannot be omitted, and secondly, the minimum degree bound cannot be improved.
Submitted 23 June, 2020; v1 submitted 12 March, 2019;
originally announced March 2019.
-
Decompositions into isomorphic rainbow spanning trees
Authors:
Stefan Glock,
Daniela Kühn,
Richard Montgomery,
Deryk Osthus
Abstract:
A subgraph of an edge-coloured graph is called rainbow if all its edges have distinct colours. Our main result implies that, given any optimal colouring of a sufficiently large complete graph $K_{2n}$, there exists a decomposition of $K_{2n}$ into isomorphic rainbow spanning trees. This settles conjectures of Brualdi--Hollingsworth (from 1996) and Constantine (from 2002) for large graphs.
Submitted 6 March, 2020; v1 submitted 11 March, 2019;
originally announced March 2019.
-
Resilient degree sequences with respect to Hamilton cycles and matchings in random graphs
Authors:
Padraig Condon,
Alberto Espuny Díaz,
Jaehoon Kim,
Daniela Kühn,
Deryk Osthus
Abstract:
Pósa's theorem states that any graph $G$ whose degree sequence $d_1 \le \ldots \le d_n$ satisfies $d_i \ge i+1$ for all $i < n/2$ has a Hamilton cycle. This degree condition is best possible. We show that a similar result holds for suitable subgraphs $G$ of random graphs, i.e. we prove a `resilience version' of Pósa's theorem: if $pn \ge C \log n$ and the $i$-th vertex degree (ordered increasingly) of $G \subseteq G_{n,p}$ is at least $(i+o(n))p$ for all $i<n/2$, then $G$ has a Hamilton cycle. This is essentially best possible and strengthens a resilience version of Dirac's theorem obtained by Lee and Sudakov.
Chvátal's theorem generalises Pósa's theorem and characterises all degree sequences which ensure the existence of a Hamilton cycle. We show that a natural guess for a resilience version of Chvátal's theorem fails to be true. We formulate a conjecture which would repair this guess, and show that the corresponding degree conditions ensure the existence of a perfect matching in any subgraph of $G_{n,p}$ which satisfies these conditions. This provides an asymptotic characterisation of all degree sequences which resiliently guarantee the existence of a perfect matching.
Submitted 2 December, 2019; v1 submitted 29 October, 2018;
originally announced October 2018.
-
Wasserstein Distributionally Robust Kalman Filtering
Authors:
Soroosh Shafieezadeh-Abadeh,
Viet Anh Nguyen,
Daniel Kuhn,
Peyman Mohajerin Esfahani
Abstract:
We study a distributionally robust mean square error estimation problem over a nonconvex Wasserstein ambiguity set containing only normal distributions. We show that the optimal estimator and the least favorable distribution form a Nash equilibrium. Despite the non-convex nature of the ambiguity set, we prove that the estimation problem is equivalent to a tractable convex program. We further devise a Frank-Wolfe algorithm for this convex program whose direction-searching subproblem can be solved in a quasi-closed form. Using these ingredients, we introduce a distributionally robust Kalman filter that hedges against model risk.
Submitted 1 October, 2018; v1 submitted 24 September, 2018;
originally announced September 2018.
-
Data-Driven Chance Constrained Programs over Wasserstein Balls
Authors:
Zhi Chen,
Daniel Kuhn,
Wolfram Wiesemann
Abstract:
We provide an exact deterministic reformulation for data-driven chance constrained programs over Wasserstein balls. For individual chance constraints as well as joint chance constraints with right-hand side uncertainty, our reformulation amounts to a mixed-integer conic program. In the special case of a Wasserstein ball with the $1$-norm or the $\infty$-norm, the cone is the nonnegative orthant, and the chance constrained program can be reformulated as a mixed-integer linear program. Our reformulation compares favourably to several state-of-the-art data-driven optimization schemes in our numerical experiments.
Submitted 31 May, 2022; v1 submitted 1 September, 2018;
originally announced September 2018.