Search | arXiv e-print repository

arXiv:2405.20124 [pdf, other]

A Geometric Unification of Distributionally Robust Covariance Estimators: Shrinking the Spectrum by Inflating the Ambiguity Set

Authors: Man-Chung Yue, Yves Rychener, Daniel Kuhn, Viet Anh Nguyen

Abstract: The state-of-the-art methods for estimating high-dimensional covariance matrices all shrink the eigenvalues of the sample covariance matrix towards a data-insensitive shrinkage target. The underlying shrinkage transformation is either chosen heuristically - without compelling theoretical justification - or optimally in view of restrictive distributional assumptions. In this paper, we propose a principled approach to construct covariance estimators without imposing restrictive assumptions. That is, we study distributionally robust covariance estimation problems that minimize the worst-case Frobenius error with respect to all data distributions close to a nominal distribution, where the proximity of distributions is measured via a divergence on the space of covariance matrices. We identify mild conditions on this divergence under which the resulting minimizers represent shrinkage estimators. We show that the corresponding shrinkage transformations are intimately related to the geometrical properties of the underlying divergence. We also prove that our robust estimators are efficiently computable and asymptotically consistent and that they enjoy finite-sample performance guarantees. We exemplify our general methodology by synthesizing explicit estimators induced by the Kullback-Leibler, Fisher-Rao, and Wasserstein divergences. Numerical experiments based on synthetic and real data show that our robust estimators are competitive with state-of-the-art estimators.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2311.07411 [pdf, ps, other]

A Large Deviations Perspective on Policy Gradient Algorithms

Authors: Wouter Jongeneel, Daniel Kuhn, Mengmeng Li

Abstract: Motivated by policy gradient methods in the context of reinforcement learning, we identify a large deviation rate function for the iterates generated by stochastic gradient descent for possibly non-convex objectives satisfying a Polyak-Łojasiewicz condition. Leveraging the contraction principle from large deviations theory, we illustrate the potential of this result by showing how convergence properties of policy gradient with a softmax parametrization and an entropy regularized objective can be naturally extended to a wide spectrum of other policy parametrizations.

Submitted 3 June, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

Comments: v3; comments are welcome

MSC Class: 60F10; 90C26

arXiv:2310.18535 [pdf, other]

Contextual Stochastic Bilevel Optimization

Authors: Yifan Hu, Jie Wang, Yao Xie, Andreas Krause, Daniel Kuhn

Abstract: We introduce contextual stochastic bilevel optimization (CSBO) -- a stochastic bilevel optimization framework with the lower-level problem minimizing an expectation conditioned on some contextual information and the upper-level decision variable. This framework extends classical stochastic bilevel optimization when the lower-level decision maker responds optimally not only to the decision of the upper-level decision maker but also to some side information and when there are multiple or even infinitely many followers. It captures important applications such as meta-learning, personalized federated learning, end-to-end learning, and Wasserstein distributionally robust optimization with side information (WDRO-SI). Due to the presence of contextual information, existing single-loop methods for classical stochastic bilevel optimization are unable to converge. To overcome this challenge, we introduce an efficient double-loop gradient method based on the Multilevel Monte-Carlo (MLMC) technique and establish its sample and computational complexities. When specialized to stochastic nonconvex optimization, our method matches existing lower bounds. For meta-learning, the complexity of our method does not depend on the number of tasks. Numerical experiments further validate our theoretical results.

Submitted 27 October, 2023; originally announced October 2023.

Comments: The paper is accepted by NeurIPS 2023

arXiv:2308.05414 [pdf, other]

Unifying Distributionally Robust Optimization via Optimal Transport Theory

Authors: Jose Blanchet, Daniel Kuhn, Jiajin Li, Bahar Taskesen

Abstract: In the past few years, there has been considerable interest in two prominent approaches for Distributionally Robust Optimization (DRO): Divergence-based and Wasserstein-based methods. The divergence approach models misspecification in terms of likelihood ratios, while the latter models it through a measure of distance or cost in actual outcomes. Building upon these advances, this paper introduces a novel approach that unifies these methods into a single framework based on optimal transport (OT) with conditional moment constraints. Our proposed approach, for example, makes it possible for optimal adversarial distributions to simultaneously perturb likelihood and outcomes, while producing an optimal (in an optimal transport sense) coupling between the baseline model and the adversarial model. Additionally, the paper investigates several duality results and presents tractable reformulations that enhance the practical applicability of this unified framework.

Submitted 10 August, 2023; originally announced August 2023.

arXiv:2306.04174 [pdf, other]

End-to-End Learning for Stochastic Optimization: A Bayesian Perspective

Authors: Yves Rychener, Daniel Kuhn, Tobias Sutter

Abstract: We develop a principled approach to end-to-end learning in stochastic optimization. First, we show that the standard end-to-end learning algorithm admits a Bayesian interpretation and trains a posterior Bayes action map. Building on the insights of this analysis, we then propose new end-to-end learning algorithms for training decision maps that output solutions of empirical risk minimization and distributionally robust optimization problems, two dominant modeling paradigms in optimization under uncertainty. Numerical results for a synthetic newsvendor problem illustrate the key differences between alternative training schemes. We also investigate an economic dispatch problem based on real data to showcase the impact of the neural network architecture of the decision maps on their test performance.

Submitted 11 June, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

Comments: Accepted at ICML 2023

arXiv:2306.02987 [pdf, other]

doi 10.1016/j.ejor.2024.03.022

Frequency Regulation with Storage: On Losses and Profits

Authors: Dirk Lauinger, François Vuille, Daniel Kuhn

Abstract: Low-carbon societies will need to store vast amounts of electricity to balance intermittent generation from wind and solar energy, for example, through frequency regulation. Here, we derive an analytical solution to the decision-making problem of storage operators who sell frequency regulation power to grid operators and trade electricity on day-ahead markets. Mathematically, we treat future frequency deviation trajectories as functional uncertainties in a receding horizon robust optimization problem. We constrain the expected terminal state-of-charge to be equal to some target to allow storage operators to make good decisions not only for the present but also the future. Thanks to this constraint, the amount of electricity traded on day-ahead markets is an implicit function of the regulation power sold to grid operators. The implicit function quantifies the amount of power that needs to be purchased to cover the expected energy loss that results from providing frequency regulation. We show how the marginal cost associated with the expected energy loss decreases with roundtrip efficiency and increases with frequency deviation dispersion. We find that the profits from frequency regulation over the lifetime of energy-constrained storage devices are roughly inversely proportional to the length of time for which regulation power must be committed.

Submitted 26 March, 2024; v1 submitted 5 June, 2023; originally announced June 2023.

arXiv:2305.19004 [pdf, ps, other]

Policy Gradient Algorithms for Robust MDPs with Non-Rectangular Uncertainty Sets

Authors: Mengmeng Li, Daniel Kuhn, Tobias Sutter

Abstract: We propose policy gradient algorithms for robust infinite-horizon Markov decision processes (MDPs) with non-rectangular uncertainty sets, thereby addressing an open challenge in the robust MDP literature. Indeed, uncertainty sets that display statistical optimality properties and make optimal use of limited data often fail to be rectangular. Unfortunately, the corresponding robust MDPs cannot be solved with dynamic programming techniques and are in fact provably intractable. We first present a randomized projected Langevin dynamics algorithm that solves the robust policy evaluation problem to global optimality but is inefficient. We also propose a deterministic policy gradient method that is efficient but solves the robust policy evaluation problem only approximately, and we prove that the approximation error scales with a new measure of non-rectangularity of the uncertainty set. Finally, we describe an actor-critic algorithm that finds an $ε$-optimal solution for the robust policy improvement problem in $\mathcal{O}(1/ε^4)$ iterations. We thus present the first complete solution scheme for robust MDPs with non-rectangular uncertainty sets offering global optimality guarantees. Numerical experiments show that our algorithms compare favorably against state-of-the-art methods.

Submitted 23 January, 2024; v1 submitted 30 May, 2023; originally announced May 2023.

Comments: comments are welcome

MSC Class: 90C17; 90C26

arXiv:2305.17037 [pdf, other]

Distributionally Robust Linear Quadratic Control

Authors: Bahar Taşkesen, Dan A. Iancu, Çağıl Koçyiğit, Daniel Kuhn

Abstract: Linear-Quadratic-Gaussian (LQG) control is a fundamental control paradigm that is studied in various fields such as engineering, computer science, economics, and neuroscience. It involves controlling a system with linear dynamics and imperfect observations, subject to additive noise, with the goal of minimizing a quadratic cost function for the state and control variables. In this work, we consider a generalization of the discrete-time, finite-horizon LQG problem, where the noise distributions are unknown and belong to Wasserstein ambiguity sets centered at nominal (Gaussian) distributions. The objective is to minimize a worst-case cost across all distributions in the ambiguity set, including non-Gaussian distributions. Despite the added complexity, we prove that a control policy that is linear in the observations is optimal for this problem, as in the classic LQG problem. We propose a numerical solution method that efficiently characterizes this optimal control policy. Our method uses the Frank-Wolfe algorithm to identify the least-favorable distributions within the Wasserstein ambiguity sets and computes the controller's optimal policy using Kalman filter estimation under these distributions.

Submitted 1 November, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

arXiv:2304.00290 [pdf, other]

PIQP: A Proximal Interior-Point Quadratic Programming Solver

Authors: Roland Schwan, Yuning Jiang, Daniel Kuhn, Colin N. Jones

Abstract: This paper presents PIQP, a high-performance toolkit for solving generic sparse quadratic programs (QP). Combining an infeasible Interior Point Method (IPM) with the Proximal Method of Multipliers (PMM), the algorithm can handle ill-conditioned convex QP problems without the need for linear independence of the constraints. The open-source implementation is written in C++ with interfaces to C, Python, Matlab, and R leveraging the Eigen3 library. The method uses a pivoting-free factorization routine and allocation-free updates of the problem data, making the solver suitable for embedded applications. The solver is evaluated on the Maros-Mészáros problem set and optimal control problems, demonstrating state-of-the-art performance for both small and large-scale problems, outperforming commercial and open-source solvers.

Submitted 15 September, 2023; v1 submitted 1 April, 2023; originally announced April 2023.

arXiv:2303.03900 [pdf, other]

New Perspectives on Regularization and Computation in Optimal Transport-Based Distributionally Robust Optimization

Authors: Soroosh Shafieezadeh-Abadeh, Liviu Aolaritei, Florian Dörfler, Daniel Kuhn

Abstract: We study optimal transport-based distributionally robust optimization problems where a fictitious adversary, often envisioned as nature, can choose the distribution of the uncertain problem parameters by reshaping a prescribed reference distribution at a finite transportation cost. In this framework, we show that robustification is intimately related to various forms of variation and Lipschitz regularization even if the transportation cost function fails to be (some power of) a metric. We also derive conditions for the existence and the computability of a Nash equilibrium between the decision-maker and nature, and we demonstrate numerically that nature's Nash strategy can be viewed as a distribution that is supported on remarkably deceptive adversarial samples. Finally, we identify practically relevant classes of optimal transport-based distributionally robust optimization problems that can be addressed with efficient gradient descent algorithms even if the loss function or the transportation cost function are nonconvex (but not both at the same time).

Submitted 7 March, 2023; originally announced March 2023.

arXiv:2211.01325 [pdf, ps, other]

Perfect matchings in random sparsifications of Dirac hypergraphs

Authors: Dong Yeap Kang, Tom Kelly, Daniela Kühn, Deryk Osthus, Vincent Pfenninger

Abstract: For all integers $n \geq k > d \geq 1$, let $m_{d}(k,n)$ be the minimum integer $D \geq 0$ such that every $k$-uniform $n$-vertex hypergraph $\mathcal H$ with minimum $d$-degree $δ_{d}(\mathcal H)$ at least $D$ has an optimal matching. For every fixed integer $k \geq 3$, we show that for $n \in k \mathbb{N}$ and $p = Ω(n^{-k+1} \log n)$, if $\mathcal H$ is an $n$-vertex $k$-uniform hypergraph with $δ_{k-1}(\mathcal H) \geq m_{k-1}(k,n)$, then a.a.s.\ its $p$-random subhypergraph $\mathcal H_p$ contains a perfect matching. Moreover, for every fixed integer $d < k$ and $γ> 0$, we show that the same conclusion holds if $\mathcal H$ is an $n$-vertex $k$-uniform hypergraph with $δ_d(\mathcal H) \geq m_{d}(k,n) + γ\binom{n - d}{k - d}$. Both of these results strengthen Johansson, Kahn, and Vu's seminal solution to Shamir's problem and can be viewed as ``robust'' versions of hypergraph Dirac-type results. In addition, we also show that in both cases above, $\mathcal H$ has at least $\exp((1-1/k)n \log n - Θ(n))$ many perfect matchings, which is best possible up to an $\exp(Θ(n))$ factor.

Submitted 16 April, 2024; v1 submitted 2 November, 2022; originally announced November 2022.

Comments: Final version, to appear in Combinatorica (26 pages + 2 page appendix); Theorem 1.5 was proved in independent work of Pham, Sah, Sawhney, and Simkin (arxiv:2210.03064)

arXiv:2206.14472 [pdf, other]

Thresholds for Latin squares and Steiner triple systems: Bounds within a logarithmic factor

Authors: Dong Yeap Kang, Tom Kelly, Daniela Kühn, Abhishek Methuku, Deryk Osthus

Abstract: We prove that for $n \in \mathbb N$ and an absolute constant $C$, if $p \geq C\log^2 n / n$ and $L_{i,j} \subseteq [n]$ is a random subset of $[n]$ where each $k\in [n]$ is included in $L_{i,j}$ independently with probability $p$ for each $i, j\in [n]$, then asymptotically almost surely there is an order-$n$ Latin square in which the entry in the $i$th row and $j$th column lies in $L_{i,j}$. The problem of determining the threshold probability for the existence of an order-$n$ Latin square was raised independently by Johansson, by Luria and Simkin, and by Casselgren and Häggkvist; our result provides an upper bound which is tight up to a factor of $\log n$ and strengthens the bound recently obtained by Sah, Sawhney, and Simkin. We also prove analogous results for Steiner triple systems and $1$-factorizations of complete graphs, and moreover, we show that each of these thresholds is at most the threshold for the existence of a $1$-factorization of a nearly complete regular bipartite graph.

Submitted 26 March, 2023; v1 submitted 29 June, 2022; originally announced June 2022.

Comments: 32 pages, 1 figure. Final version, to appear in Transactions of the AMS

arXiv:2206.13374 [pdf, other]

Stability Verification of Neural Network Controllers using Mixed-Integer Programming

Authors: Roland Schwan, Colin N. Jones, Daniel Kuhn

Abstract: We propose a framework for the stability verification of Mixed-Integer Linear Programming (MILP) representable control policies. This framework compares a fixed candidate policy, which admits an efficient parameterization and can be evaluated at a low computational cost, against a fixed baseline policy, which is known to be stable but expensive to evaluate. We provide sufficient conditions for the closed-loop stability of the candidate policy in terms of the worst-case approximation error with respect to the baseline policy, and we show that these conditions can be checked by solving a Mixed-Integer Quadratic Program (MIQP). Additionally, we demonstrate that an outer and inner approximation of the stability region of the candidate policy can be computed by solving an MILP. The proposed framework is sufficiently general to accommodate a broad range of candidate policies including ReLU Neural Networks (NNs), optimal solution maps of parametric quadratic programs, and Model Predictive Control (MPC) policies. We also present an open-source toolbox in Python based on the proposed framework, which allows for the easy verification of custom NN architectures and MPC formulations. We showcase the flexibility and reliability of our framework in the context of a DC-DC power converter case study and investigate its computational complexity.

Submitted 31 May, 2023; v1 submitted 27 June, 2022; originally announced June 2022.

arXiv:2206.00231 [pdf, other]

On Approximations of Data-Driven Chance Constrained Programs over Wasserstein Balls

Authors: Zhi Chen, Daniel Kuhn, Wolfram Wiesemann

Abstract: Distributionally robust chance constrained programs minimize a deterministic cost function subject to the satisfaction of one or more safety conditions with high probability, given that the probability distribution of the uncertain problem parameters affecting the safety condition(s) is only known to belong to some ambiguity set. We study three popular approximation schemes for distributionally robust chance constrained programs over Wasserstein balls, where the ambiguity set contains all probability distributions within a certain Wasserstein distance to a reference distribution. The first approximation replaces the chance constraint with a bound on the conditional value-at-risk, the second approximation decouples different safety conditions via Bonferroni's inequality, and the third approximation restricts the expected violation of the safety condition(s) so that the chance constraint is satisfied. We show that the conditional value-at-risk approximation can be characterized as a tight convex approximation, which complements earlier findings on classical (non-robust) chance constraints, and we offer a novel interpretation in terms of transportation savings. We also show that the three approximations can perform arbitrarily poorly in data-driven settings, and that they are generally incomparable with each other.

Submitted 20 November, 2022; v1 submitted 1 June, 2022; originally announced June 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:1809.00210

arXiv:2205.15049 [pdf, other]

Metrizing Fairness

Authors: Yves Rychener, Bahar Taskesen, Daniel Kuhn

Abstract: We study supervised learning problems that have significant effects on individuals from two demographic groups, and we seek predictors that are fair with respect to a group fairness criterion such as statistical parity (SP). A predictor is SP-fair if the distributions of predictions within the two groups are close in Kolmogorov distance, and fairness is achieved by penalizing the dissimilarity of these two distributions in the objective function of the learning problem. In this paper, we identify conditions under which hard SP constraints are guaranteed to improve predictive accuracy. We also showcase conceptual and computational benefits of measuring unfairness with integral probability metrics (IPMs) other than the Kolmogorov distance. Conceptually, we show that the generator of any IPM can be interpreted as a family of utility functions and that unfairness with respect to this IPM arises if individuals in the two demographic groups have diverging expected utilities. We also prove that the unfairness-regularized prediction loss admits unbiased gradient estimators, which are constructed from random mini-batches of training samples, if unfairness is measured by the squared $\mathcal L^2$-distance or by a squared maximum mean discrepancy. In this case, the fair learning problem is susceptible to efficient stochastic gradient descent (SGD) algorithms. Numerical experiments on synthetic and real data show that these SGD algorithms outperform state-of-the-art methods for fair learning in that they achieve superior accuracy-unfairness trade-offs -- sometimes orders of magnitude faster.

Submitted 11 June, 2024; v1 submitted 30 May, 2022; originally announced May 2022.

arXiv:2203.01161 [pdf, ps, other]

Discrete Optimal Transport with Independent Marginals is #P-Hard

Authors: Bahar Taşkesen, Soroosh Shafieezadeh-Abadeh, Daniel Kuhn, Karthik Natarajan

Abstract: We study the computational complexity of the optimal transport problem that evaluates the Wasserstein distance between the distributions of two K-dimensional discrete random vectors. The best known algorithms for this problem run in polynomial time in the maximum of the number of atoms of the two distributions. However, if the components of either random vector are independent, then this number can be exponential in K even though the size of the problem description scales linearly with K. We prove that the described optimal transport problem is #P-hard even if all components of the first random vector are independent uniform Bernoulli random variables, while the second random vector has merely two atoms, and even if only approximate solutions are sought. We also develop a dynamic programming-type algorithm that approximates the Wasserstein distance in pseudo-polynomial time when the components of the first random vector follow arbitrary independent discrete distributions, and we identify special problem instances that can be solved exactly in strongly polynomial time.

Submitted 14 October, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

arXiv:2112.09959 [pdf, other]

Mean-Covariance Robust Risk Measurement

Authors: Viet Anh Nguyen, Soroosh Shafiee, Damir Filipović, Daniel Kuhn

Abstract: We introduce a universal framework for mean-covariance robust risk measurement and portfolio optimization. We model uncertainty in terms of the Gelbrich distance on the mean-covariance space, along with prior structural information about the population distribution. Our approach is related to the theory of optimal transport and exhibits superior statistical and computational properties than existing models. We find that, for a large class of risk measures, mean-covariance robust portfolio optimization boils down to the Markowitz model, subject to a regularization term given in closed form. This includes the finance standards, value-at-risk and conditional value-at-risk, and can be solved highly efficiently.

Submitted 30 November, 2023; v1 submitted 18 December, 2021; originally announced December 2021.

arXiv:2110.06181 [pdf, ps, other]

Solution to a problem of Erdős on the chromatic index of hypergraphs with bounded codegree

Authors: Dong Yeap Kang, Tom Kelly, Daniela Kühn, Abhishek Methuku, Deryk Osthus

Abstract: In 1977, Erdős asked the following question: for any integers $t,n \in \mathbb{N}$, if $G_1 , \dots , G_n$ are complete graphs such that each $G_i$ has at most $n$ vertices and every pair of them shares at most $t$ vertices, what is the largest possible chromatic number of the union $\bigcup_{i=1}^{n} G_i$? The equivalent dual formulation of this question asks for the largest chromatic index of an $n$-vertex hypergraph with maximum degree at most $n$ and codegree at most $t$. For the case $t = 1$, Erdős, Faber, and Lovász famously conjectured that the answer is $n$, which was recently proved by the authors for all sufficiently large $n$. In this paper, we answer this question of Erdős for $t \geq 2$ in a strong sense, by proving that every $n$-vertex hypergraph with maximum degree at most $(1-o(1))tn$ and codegree at most $t$ has chromatic index at most $tn$ for any $t,n \in \mathbb{N}$. Moreover, equality holds if and only if the hypergraph is a $t$-fold projective plane of order $k$, where $n = k^2 + k + 1$. Thus, for every $t \in \mathbb N$, this bound is best possible for infinitely many integers $n$. This result also holds for the list chromatic index.

Submitted 12 October, 2021; originally announced October 2021.

Comments: 23 pages

arXiv:2110.01570 [pdf, ps, other]

Hypergraph regularity and random sampling

Authors: Felix Joos, Jaehoon Kim, Daniela Kühn, Deryk Osthus

Abstract: Suppose a $k$-uniform hypergraph $H$ satisfies a certain regularity instance (that is, there is a partition of $H$ given by the hypergraph regularity lemma into a bounded number of quasirandom subhypergraphs of prescribed densities). We prove that with high probability a large enough uniform random sample of the vertex set of $H$ also admits the same regularity instance. Here the crucial feature is that the error term measuring the quasirandomness of the subhypergraphs requires only an arbitrarily small additive correction. This has applications to combinatorial property testing. The graph case of the sampling result was proved by Alon, Fischer, Newman and Shapira.

Submitted 11 August, 2022; v1 submitted 4 October, 2021; originally announced October 2021.

Comments: 49 pages; we split our paper arXiv:1707.03303 into two, this one and the new version of arXiv:1707.03303. Final version, to appear in Random Structures and Algorithms

arXiv:2109.11438 [pdf, ps, other]

A special case of Vu's conjecture: Coloring nearly disjoint graphs of bounded maximum degree

Authors: Tom Kelly, Daniela Kühn, Deryk Osthus

Abstract: A collection of graphs is \textit{nearly disjoint} if every pair of them intersects in at most one vertex. We prove that if $G_1, \dots, G_m$ are nearly disjoint graphs of maximum degree at most $D$, then the following holds. For every fixed $C$, if each vertex $v \in \bigcup_{i=1}^m V(G_i)$ is contained in at most $C$ of the graphs $G_1, \dots, G_m$, then the (list) chromatic number of $\bigcup_{i=1}^m G_i$ is at most $D + o(D)$. This result confirms a special case of a conjecture of Vu and generalizes Kahn's bound on the list chromatic index of linear uniform hypergraphs of bounded maximum degree. In fact, this result holds for the correspondence (or DP) chromatic number and thus implies a recent result of Molloy, and we derive this result from a more general list coloring result in the setting of `color degrees' that also implies a result of Reed and Sudakov.

Submitted 28 October, 2023; v1 submitted 23 September, 2021; originally announced September 2021.

Comments: 16 pages with one-page appendix; final version, to appear in Combinatorics, Probability, and Computing

arXiv:2106.13733 [pdf, other]

Graph and hypergraph colouring via nibble methods: A survey

Authors: Dong Yeap Kang, Tom Kelly, Daniela Kühn, Abhishek Methuku, Deryk Osthus

Abstract: This paper provides a survey of methods, results, and open problems on graph and hypergraph colourings, with a particular emphasis on semi-random `nibble' methods. We also give a detailed sketch of some aspects of the recent proof of the Erdős-Faber-Lovász conjecture.

Submitted 16 November, 2021; v1 submitted 25 June, 2021; originally announced June 2021.

Comments: Final version, to appear in the proceedings of the 8th European Congress of Mathematics; 33 pages, 3 figures

arXiv:2106.06741 [pdf, other]

Distributionally Robust Optimization with Markovian Data

Authors: Mengmeng Li, Tobias Sutter, Daniel Kuhn

Abstract: We study a stochastic program where the probability distribution of the uncertain problem parameters is unknown and only indirectly observed via finitely many correlated samples generated by an unknown Markov chain with $d$ states. We propose a data-driven distributionally robust optimization model to estimate the problem's objective function and optimal solution. By leveraging results from large deviations theory, we derive statistical guarantees on the quality of these estimators. The underlying worst-case expectation problem is nonconvex and involves $\mathcal O(d^2)$ decision variables. Thus, it cannot be solved efficiently for large $d$. By exploiting the structure of this problem, we devise a customized Frank-Wolfe algorithm with convex direction-finding subproblems of size $\mathcal O(d)$. We prove that this algorithm finds a stationary point efficiently under mild conditions. The efficiency of the method is predicated on a dimensionality reduction enabled by a dual reformulation. Numerical experiments indicate that our approach has better computational and statistical properties than the state-of-the-art methods.

Submitted 12 June, 2021; originally announced June 2021.

Comments: 20 pages

arXiv:2106.04443 [pdf, other]

Robust Generalization despite Distribution Shift via Minimum Discriminating Information

Authors: Tobias Sutter, Andreas Krause, Daniel Kuhn

Abstract: Training models that perform well under distribution shifts is a central challenge in machine learning. In this paper, we introduce a modeling framework where, in addition to training data, we have partial structural knowledge of the shifted test distribution. We employ the principle of minimum discriminating information to embed the available prior knowledge, and use distributionally robust optimization to account for uncertainty due to the limited samples. By leveraging large deviation results, we obtain explicit generalization bounds with respect to the unknown shifted distribution. Lastly, we demonstrate the versatility of our framework by applying it to two rather distinct applications: (1) training classifiers on systematically biased data and (2) off-policy evaluation in Markov Decision Processes.

Submitted 26 October, 2021; v1 submitted 8 June, 2021; originally announced June 2021.

Comments: 23 pages, 4 figures

Journal ref: NeurIPS 2021

arXiv:2106.00322 [pdf, other]

Sequential Domain Adaptation by Synthesizing Distributionally Robust Experts

Authors: Bahar Taskesen, Man-Chung Yue, Jose Blanchet, Daniel Kuhn, Viet Anh Nguyen

Abstract: Least squares estimators, when trained on a few target domain samples, may predict poorly. Supervised domain adaptation aims to improve the predictive accuracy by exploiting additional labeled training samples from a source distribution that is close to the target distribution. Given available data, we investigate novel strategies to synthesize a family of least squares estimator experts that are robust with regard to moment conditions. When these moment conditions are specified using Kullback-Leibler or Wasserstein-type divergences, we can find the robust estimators efficiently using convex optimization. We use the Bernstein online aggregation algorithm on the proposed family of robust experts to generate predictions for the sequential stream of target test samples. Numerical experiments on real data show that the robust strategies may outperform non-robust interpolations of the empirical least squares estimators.

Submitted 1 June, 2021; originally announced June 2021.

arXiv:2105.00760 [pdf, ps, other]

A Unified Theory of Robust and Distributionally Robust Optimization via the Primal-Worst-Equals-Dual-Best Principle

Authors: Jianzhe Zhen, Daniel Kuhn, Wolfram Wiesemann

Abstract: Robust and distributionally robust optimization are modeling paradigms for decision-making under uncertainty where the uncertain parameters are only known to reside in an uncertainty set or are governed by any probability distribution from within an ambiguity set, respectively, and a decision is sought that minimizes a cost function under the most adverse outcome of the uncertainty. In this paper, we develop a rigorous and general theory of robust and distributionally robust nonlinear optimization using the language of convex analysis. Our framework is based on a generalized `primal-worst-equals-dual-best' principle that establishes strong duality between a semi-infinite primal worst and a non-convex dual best formulation, both of which admit finite convex reformulations. This principle offers an alternative formulation for robust optimization problems that obviates the need to mobilize the machinery of abstract semi-infinite duality theory to prove strong duality in distributionally robust optimization. We illustrate the modeling power of our approach through convex reformulations for distributionally robust optimization problems whose ambiguity sets are defined through general optimal transport distances, which generalize earlier results for Wasserstein ambiguity sets.

Submitted 19 July, 2023; v1 submitted 3 May, 2021; originally announced May 2021.

Comments: Previous title: Mathematical Foundations of Robust and Distributionally Robust Optimization

arXiv:2103.06263 [pdf, other]

Semi-Discrete Optimal Transport: Hardness, Regularization and Numerical Solution

Authors: Bahar Taskesen, Soroosh Shafieezadeh-Abadeh, Daniel Kuhn

Abstract: Semi-discrete optimal transport problems, which evaluate the Wasserstein distance between a discrete and a generic (possibly non-discrete) probability measure, are believed to be computationally hard. Even though such problems are ubiquitous in statistics, machine learning and computer vision, this perception has not yet received a theoretical justification. To fill this gap, we prove that computing the Wasserstein distance between a discrete probability measure supported on two points and the Lebesgue measure on the standard hypercube is already #P-hard. This insight prompts us to seek approximate solutions for semi-discrete optimal transport problems. We thus perturb the underlying transportation cost with an additive disturbance governed by an ambiguous probability distribution, and we introduce a distributionally robust dual optimal transport problem whose objective function is smoothed with the most adverse disturbance distributions from within a given ambiguity set. We further show that smoothing the dual objective function is equivalent to regularizing the primal objective function, and we identify several ambiguity sets that give rise to several known and new regularization schemes. As a byproduct, we discover an intimate relation between semi-discrete optimal transport problems and discrete choice models traditionally studied in psychology and economics. To solve the regularized optimal transport problems efficiently, we use a stochastic gradient descent algorithm with imprecise stochastic gradient oracles. A new convergence analysis reveals that this algorithm improves the best known convergence guarantee for semi-discrete optimal transport problems with entropic regularizers.

Submitted 29 April, 2022; v1 submitted 10 March, 2021; originally announced March 2021.

arXiv:2103.05478 [pdf, other]

Small errors in random zeroth-order optimization are imaginary

Authors: Wouter Jongeneel, Man-Chung Yue, Daniel Kuhn

Abstract: Most zeroth-order optimization algorithms mimic a first-order algorithm but replace the gradient of the objective function with some gradient estimator that can be computed from a small number of function evaluations. This estimator is constructed randomly, and its expectation matches the gradient of a smooth approximation of the objective function whose quality improves as the underlying smoothing parameter $δ$ is reduced. Gradient estimators requiring a smaller number of function evaluations are preferable from a computational point of view. While estimators based on a single function evaluation can be obtained by use of the divergence theorem from vector calculus, their variance explodes as $δ$ tends to $0$. Estimators based on multiple function evaluations, on the other hand, suffer from numerical cancellation when $δ$ tends to $0$. To combat both effects simultaneously, we extend the objective function to the complex domain and construct a gradient estimator that evaluates the objective at a complex point whose coordinates have small imaginary parts of the order $δ$. As this estimator requires only one function evaluation, it is immune to cancellation. In addition, its variance remains bounded as $δ$ tends to $0$. We prove that zeroth-order algorithms that use our estimator offer the same theoretical convergence guarantees as the state-of-the-art methods. Numerical experiments suggest, however, that they often converge faster in practice.

Submitted 19 March, 2024; v1 submitted 9 March, 2021; originally announced March 2021.

Comments: Final version (33 pages), to appear in the SIAM Journal on Optimization

MSC Class: 65D25; 65G50; 65K05; 65Y04; 65Y20; 90C56
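
The cancellation issue described in the abstract of arXiv:2103.05478 can be illustrated with the classical one-dimensional complex-step derivative, $f'(x) \approx \operatorname{Im} f(x + iδ)/δ$. The sketch below is not the randomized estimator proposed in the paper; it is a deterministic textbook analogue, and the function names and the test function are assumptions made for illustration only.

```python
def complex_step_derivative(f, x, delta=1e-20):
    """Approximate f'(x) from ONE evaluation of f at the complex point x + i*delta.

    No subtraction of nearly equal numbers occurs, so delta can be tiny.
    Assumes f is analytic and accepts complex arguments.
    """
    return f(complex(x, delta)).imag / delta


def forward_difference(f, x, delta=1e-20):
    """Approximate f'(x) from TWO real evaluations; suffers from cancellation."""
    return (f(x + delta) - f(x)) / delta


def cubic(z):
    # Hypothetical smooth test function; works for float and complex inputs.
    return z * z * z + 2 * z


x0 = 1.5
exact = 3 * x0 * x0 + 2                                  # analytic derivative, 8.75
complex_estimate = complex_step_derivative(cubic, x0)    # close to 8.75
difference_estimate = forward_difference(cubic, x0)      # 0.0: x0 + 1e-20 rounds to x0
```

With a step of this size the two-point difference collapses to zero because the perturbation is lost in floating-point rounding, whereas the single complex evaluation still recovers the derivative.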

arXiv:2103.03805 [pdf, other]

doi 10.1109/LCSYS.2021.3072814

Topological Linear System Identification via Moderate Deviations Theory

Authors: Wouter Jongeneel, Tobias Sutter, Daniel Kuhn

Abstract: Two dynamical systems are topologically equivalent when their phase-portraits can be morphed into each other by a homeomorphic coordinate transformation on the state space. The induced equivalence classes capture qualitative properties such as stability or the oscillatory nature of the state trajectories. In this paper we develop a method to learn the topological class of an unknown stable system from a single trajectory of finitely many state observations. Using a moderate deviations principle for the least squares estimator of the unknown system matrix $θ$, we prove that the probability of misclassification decays exponentially with the number of observations at a rate that is proportional to the square of the smallest singular value of $θ$.

Submitted 9 April, 2021; v1 submitted 5 March, 2021; originally announced March 2021.

Comments: updated Section 3.A

arXiv:2103.02806 [pdf, other]

A Planner-Trader Decomposition for Multi-Market Hydro Scheduling

Authors: Kilian Schindler, Napat Rujeerapaiboon, Daniel Kuhn, Wolfram Wiesemann

Abstract: Peak/off-peak spreads on European electricity forward and spot markets are eroding due to the ongoing nuclear phaseout in Germany and the steady growth in photovoltaic capacity. The reduced profitability of peak/off-peak arbitrage forces hydropower producers to recover part of their original profitability on the reserve markets. We propose a bi-layer stochastic programming framework for the optimal operation of a fleet of interconnected hydropower plants that sells energy on both the spot and the reserve markets. The outer layer (the planner's problem) optimizes end-of-day reservoir filling levels over one year, whereas the inner layer (the trader's problem) selects optimal hourly market bids within each day. Using an information restriction whereby the planner prescribes the end-of-day reservoir targets one day in advance, we prove that the trader's problem simplifies from an infinite-dimensional stochastic program with 25 stages to a finite two-stage stochastic program with only two scenarios. Substituting this reformulation back into the outer layer and approximating the reservoir targets by affine decision rules allows us to simplify the planner's problem from an infinite-dimensional stochastic program with 365 stages to a two-stage stochastic program that can conveniently be solved via the sample average approximation. Numerical experiments based on a cascade in the Salzburg region of Austria demonstrate the effectiveness of the suggested framework.

Submitted 2 September, 2022; v1 submitted 3 March, 2021; originally announced March 2021.

MSC Class: 90C15; 90C17; 90C90

arXiv:2102.03664 [pdf, other]

doi 10.1109/TAC.2022.3213770

Efficient Learning of a Linear Dynamical System with Stability Guarantees

Authors: Wouter Jongeneel, Tobias Sutter, Daniel Kuhn

Abstract: We propose a principled method for projecting an arbitrary square matrix to the non-convex set of asymptotically stable matrices. Leveraging ideas from large deviations theory, we show that this projection is optimal in an information-theoretic sense and that it simply amounts to shifting the initial matrix by an optimal linear quadratic feedback gain, which can be computed exactly and highly efficiently by solving a standard linear quadratic regulator problem. The proposed approach allows us to learn the system matrix of a stable linear dynamical system from a single trajectory of correlated state observations. The resulting estimator is guaranteed to be stable and offers explicit statistical bounds on the estimation error.

Submitted 13 June, 2022; v1 submitted 6 February, 2021; originally announced February 2021.

Comments: Exposition has been updated

Journal ref: IEEE Transactions on Automatic Control (Volume: 68, Issue: 5, May 2023)

arXiv:2101.04698 [pdf, ps, other]

A proof of the Erdős-Faber-Lovász conjecture

Authors: Dong Yeap Kang, Tom Kelly, Daniela Kühn, Abhishek Methuku, Deryk Osthus

Abstract: The Erdős-Faber-Lovász conjecture (posed in 1972) states that the chromatic index of any linear hypergraph on $n$ vertices is at most $n$. In this paper, we prove this conjecture for every large $n$. We also provide stability versions of this result, which confirm a prediction of Kahn.

Submitted 25 January, 2023; v1 submitted 12 January, 2021; originally announced January 2021.

Comments: 47 pages, 3 figures. Final version, to appear in the Annals of Mathematics

arXiv:2010.14158 [pdf, ps, other]

doi 10.1112/plms.12480

Path decompositions of tournaments

Authors: António Girão, Bertille Granet, Daniela Kühn, Allan Lo, Deryk Osthus

Abstract: In 1976, Alspach, Mason, and Pullman conjectured that any tournament $T$ of even order can be decomposed into exactly ${\rm ex}(T)$ paths, where ${\rm ex}(T):= \frac{1}{2}\sum_{v\in V(T)}|d_T^+(v)-d_T^-(v)|$. We prove this conjecture for all sufficiently large tournaments. We also prove an asymptotically optimal result for tournaments of odd order.

Submitted 28 July, 2022; v1 submitted 27 October, 2020; originally announced October 2020.

Comments: 73 pages, 2 figures; final version, to appear in the Proceedings of the London Mathematical Society

Journal ref: Proc. London Math. Soc., 126 (2023): 429-517
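
The excess ${\rm ex}(T)$ defined in the abstract of arXiv:2010.14158 is straightforward to evaluate from an arc list. The following is a minimal sketch of that formula only; the function name, the arc representation as `(u, v)` pairs meaning $u \to v$, and the example tournaments are assumptions made for illustration and do not come from the paper.

```python
from collections import defaultdict


def tournament_excess(arcs):
    """Return ex(T) = (1/2) * sum_v |d^+(v) - d^-(v)| for a list of arcs (u, v), u -> v."""
    balance = defaultdict(int)  # balance[v] = out-degree minus in-degree of v
    for u, v in arcs:
        balance[u] += 1
        balance[v] -= 1
    # The balances sum to zero, so the sum of absolute values is always even.
    return sum(abs(b) for b in balance.values()) // 2


# Transitive tournament on 4 vertices: i -> j whenever i < j.
transitive_4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]
# Cyclic triangle: every vertex has out-degree 1 and in-degree 1.
cyclic_3 = [(0, 1), (1, 2), (2, 0)]

excess_transitive = tournament_excess(transitive_4)  # balances 3, 1, -1, -3 give 4
excess_cyclic = tournament_excess(cyclic_3)          # all balances are 0, giving 0
```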

arXiv:2010.06606 [pdf, other]

A Pareto Dominance Principle for Data-Driven Optimization

Authors: Tobias Sutter, Bart P. G. Van Parys, Daniel Kuhn

Abstract: We propose a statistically optimal approach to construct data-driven decisions for stochastic optimization problems. Fundamentally, a data-driven decision is simply a function that maps the available training data to a feasible action. It can always be expressed as the minimizer of a surrogate optimization model constructed from the data. The quality of a data-driven decision is measured by its out-of-sample risk. An additional quality measure is its out-of-sample disappointment, which we define as the probability that the out-of-sample risk exceeds the optimal value of the surrogate optimization model. An ideal data-driven decision should minimize the out-of-sample risk simultaneously with respect to every conceivable probability measure as the true measure is unknown. Unfortunately, such ideal data-driven decisions are generally unavailable. This prompts us to seek data-driven decisions that minimize the in-sample risk subject to an upper bound on the out-of-sample disappointment. We prove that such Pareto-dominant data-driven decisions exist under conditions that allow for interesting applications: the unknown data-generating probability measure must belong to a parametric ambiguity set, and the corresponding parameters must admit a sufficient statistic that satisfies a large deviation principle. We can further prove that the surrogate optimization model must be a distributionally robust optimization problem constructed from the sufficient statistic and the rate function of its large deviation principle. Hence the optimal method for mapping data to decisions is to solve a distributionally robust optimization model. Maybe surprisingly, this result holds even when the training data is non-i.i.d. Our analysis reveals how the structural properties of the data-generating stochastic process impact the shape of the ambiguity set underlying the optimal distributionally robust model.

Submitted 14 December, 2023; v1 submitted 13 October, 2020; originally announced October 2020.

Comments: 55 pages

arXiv:2010.04183 [pdf, ps, other]

New bounds on the size of Nearly Perfect Matchings in almost regular hypergraphs

Authors: Dong Yeap Kang, Daniela Kühn, Abhishek Methuku, Deryk Osthus

Abstract: Let $H$ be a $k$-uniform $D$-regular simple hypergraph on $N$ vertices. Based on an analysis of the Rödl nibble, Alon, Kim and Spencer (1997) proved that if $k \ge 3$, then $H$ contains a matching covering all but at most $ND^{-1/(k-1)+o(1)}$ vertices, and asked whether this bound is tight. In this paper we improve their bound by showing that for all $k > 3$, $H$ contains a matching covering all but at most $ND^{-1/(k-1)-η}$ vertices for some $η= Θ(k^{-3}) > 0$, when $N$ and $D$ are sufficiently large. Our approach consists of showing that the Rödl nibble process not only constructs a large matching but it also produces many well-distributed `augmenting stars' which can then be used to significantly improve the matching constructed by the Rödl nibble process. Based on this, we also improve the results of Kostochka and Rödl (1998) and Vu (2000) on the size of matchings in almost regular hypergraphs with small codegree. As a consequence, we improve the best known bounds on the size of large matchings in combinatorial designs with general parameters. Finally, we improve the bounds of Molloy and Reed (2000) on the chromatic index of hypergraphs with small codegree (which can be applied to improve the best known bounds on the chromatic index of Steiner triple systems and more general designs).

Submitted 8 October, 2020; originally announced October 2020.

Comments: 35 pages, 1 figure

arXiv:2008.00926 [pdf, ps, other]

Extremal aspects of graph and hypergraph decomposition problems

Authors: Stefan Glock, Daniela Kühn, Deryk Osthus

Abstract: We survey recent advances in the theory of graph and hypergraph decompositions, with a focus on extremal results involving minimum degree conditions. We also collect a number of intriguing open problems, and formulate new ones.

Submitted 25 June, 2021; v1 submitted 3 August, 2020; originally announced August 2020.

Comments: final version as appearing in Surveys in Combinatorics 2021

arXiv:2007.02891 [pdf, ps, other]

Hamiltonicity of random subgraphs of the hypercube

Authors: Padraig Condon, Alberto Espuny Díaz, António Girão, Daniela Kühn, Deryk Osthus

Abstract: We study Hamiltonicity in random subgraphs of the hypercube $\mathcal{Q}^n$. Our first main theorem is an optimal hitting time result. Consider the random process which includes the edges of $\mathcal{Q}^n$ according to a uniformly chosen random ordering. Then, with high probability, as soon as the graph produced by this process has minimum degree $2k$, it contains $k$ edge-disjoint Hamilton cycles, for any fixed $k\in\mathbb{N}$. Secondly, we obtain a perturbation result: if $H\subseteq\mathcal{Q}^n$ satisfies $δ(H)\geq αn$ with $α>0$ fixed and we consider a random binomial subgraph $\mathcal{Q}^n_p$ of $\mathcal{Q}^n$ with $p\in(0,1]$ fixed, then with high probability $H\cup\mathcal{Q}^n_p$ contains $k$ edge-disjoint Hamilton cycles, for any fixed $k\in\mathbb{N}$. In particular, both results resolve a long-standing conjecture, posed e.g. by Bollobás, that the threshold probability for Hamiltonicity in the random binomial subgraph of the hypercube equals $1/2$. Our techniques also show that, with high probability, for all fixed $p\in(0,1]$ the graph $\mathcal{Q}^n_p$ contains an almost spanning cycle. Our methods involve branching processes, the Rödl nibble, and absorption.

Submitted 13 August, 2022; v1 submitted 6 July, 2020; originally announced July 2020.

Comments: Final version, to appear in Memoirs of the AMS

arXiv:2007.00395 [pdf, ps, other]

Almost all optimally coloured complete graphs contain a rainbow Hamilton path

Authors: Stephen Gould, Tom Kelly, Daniela Kühn, Deryk Osthus

Abstract: A subgraph $H$ of an edge-coloured graph is called rainbow if all of the edges of $H$ have different colours. In 1989, Andersen conjectured that every proper edge-colouring of $K_{n}$ admits a rainbow path of length $n-2$. We show that almost all optimal edge-colourings of $K_{n}$ admit both (i) a rainbow Hamilton path and (ii) a rainbow cycle using all of the colours. This result demonstrates that Andersen's Conjecture holds for almost all optimal edge-colourings of $K_{n}$ and answers a recent question of Ferber, Jain, and Sudakov. Our result also has applications to the existence of transversals in random symmetric Latin squares.

Submitted 21 April, 2022; v1 submitted 1 July, 2020; originally announced July 2020.

Comments: 30 pages, 5 figures. Final version, to appear in Journal of Combinatorial Theory, Series B

arXiv:2005.06042 [pdf, other]

doi 10.1287/msom.2022.0154

Reliable Frequency Regulation through Vehicle-to-Grid: Encoding Legislation with Robust Constraints

Authors: Dirk Lauinger, François Vuille, Daniel Kuhn

Abstract: Problem definition: Vehicle-to-grid increases the low utilization rate of privately owned electric vehicles by making their batteries available to electricity grids. We formulate a robust optimization problem that maximizes a vehicle owner's expected profit from selling primary frequency regulation to the grid and guarantees that market commitments are met at all times for all frequency deviation trajectories in a functional uncertainty set that encodes applicable legislation. Faithfully modeling the energy conversion losses during battery charging and discharging renders this optimization problem non-convex. Methodology/results: By exploiting a total unimodularity property of the uncertainty set and an exact linear decision rule reformulation, we prove that this non-convex robust optimization problem with functional uncertainties is equivalent to a tractable linear program. Through extensive numerical experiments using real-world data, we quantify the economic value of vehicle-to-grid and elucidate the financial incentives of vehicle owners, aggregators, equipment manufacturers, and regulators. Managerial implications: We find that the prevailing penalties for non-delivery of promised regulation power are too low to incentivize vehicle owners to honor the delivery guarantees given to grid operators.

Submitted 9 February, 2024; v1 submitted 12 May, 2020; originally announced May 2020.

Journal ref: Manufacturing & Service Operations Management 26(2):722-738, 2024

arXiv:2004.07162 [pdf, ps, other]

On Linear Optimization over Wasserstein Balls

Authors: Man-Chung Yue, Daniel Kuhn, Wolfram Wiesemann

Abstract: Wasserstein balls, which contain all probability measures within a pre-specified Wasserstein distance to a reference measure, have recently enjoyed wide popularity in the distributionally robust optimization and machine learning communities to formulate and solve data-driven optimization problems with rigorous statistical guarantees. In this technical note we prove that the Wasserstein ball is weakly compact under mild conditions, and we offer necessary and sufficient conditions for the existence of optimal solutions. We also characterize the sparsity of solutions if the Wasserstein ball is centred at a discrete reference measure. In comparison with the existing literature, which has proved similar results under different conditions, our proofs are self-contained and shorter, yet mathematically rigorous, and our necessary and sufficient conditions for the existence of optimal solutions are easily verifiable in practice.

Submitted 6 June, 2021; v1 submitted 15 April, 2020; originally announced April 2020.

arXiv:1911.08887 [pdf, ps, other]

doi 10.1017/S0963548320000619

Counting Hamilton cycles in Dirac hypergraphs

Authors: Stefan Glock, Stephen Gould, Felix Joos, Daniela Kühn, Deryk Osthus

Abstract: A tight Hamilton cycle in a $k$-uniform hypergraph ($k$-graph) $G$ is a cyclic ordering of the vertices of $G$ such that every set of $k$ consecutive vertices in the ordering forms an edge. Rödl, Ruciński, and Szemerédi proved that for $k\geq 3$, every $k$-graph on $n$ vertices with minimum codegree at least $n/2+o(n)$ contains a tight Hamilton cycle. We show that the number of tight Hamilton cycles in such $k$-graphs is $\exp(n\ln n-\Theta(n))$. As a corollary, we obtain a similar estimate on the number of Hamilton $\ell$-cycles in such $k$-graphs for all $\ell\in\{0,\dots,k-1\}$, which makes progress on a question of Ferber, Krivelevich and Sudakov.

Submitted 10 November, 2020; v1 submitted 20 November, 2019; originally announced November 2019.

Comments: 20 pages. Final version, to appear in Combinatorics, Probability & Computing

Journal ref: Combinator. Probab. Comp. 30 (2021) 631-653

arXiv:1911.05501 [pdf, ps, other]

doi 10.1112/jlms.12455

Path and cycle decompositions of dense graphs

Authors: António Girão, Bertille Granet, Daniela Kühn, Deryk Osthus

Abstract: We make progress on three long standing conjectures from the 1960s about path and cycle decompositions of graphs. Gallai conjectured that any connected graph on $n$ vertices can be decomposed into at most $\left\lceil \frac{n}{2}\right\rceil$ paths, while a conjecture of Hajós states that any Eulerian graph on $n$ vertices can be decomposed into at most $\left\lfloor \frac{n-1}{2}\right\rfloor$ cycles. The Erdős-Gallai conjecture states that any graph on $n$ vertices can be decomposed into $O(n)$ cycles and edges. We show that if $G$ is a sufficiently large graph on $n$ vertices with linear minimum degree, then the following hold. (i) $G$ can be decomposed into at most $\frac{n}{2}+o(n)$ paths. (ii) If $G$ is Eulerian, then it can be decomposed into at most $\frac{n}{2}+o(n)$ cycles. (iii) $G$ can be decomposed into at most $\frac{3 n}{2}+o(n)$ cycles and edges. If in addition $G$ satisfies a weak expansion property, we asymptotically determine the required number of paths/cycles for each such $G$. (iv) $G$ can be decomposed into $\max \left\{\frac{odd(G)}{2},\frac{\Delta(G)}{2}\right\}+o(n)$ paths, where $odd(G)$ is the number of odd-degree vertices of $G$. (v) If $G$ is Eulerian, then it can be decomposed into $\frac{\Delta(G)}{2}+o(n)$ cycles. All bounds in (i)-(v) are asymptotically best possible.

Submitted 15 March, 2021; v1 submitted 13 November, 2019; originally announced November 2019.

Comments: 48 pages, 2 figures; final version, to appear in the Journal of the London Mathematical Society

Journal ref: J. London Math. Soc., 104 (2021): 1085-1134

arXiv:1911.03539 [pdf, other]

Bridging Bayesian and Minimax Mean Square Error Estimation via Wasserstein Distributionally Robust Optimization

Authors: Viet Anh Nguyen, Soroosh Shafieezadeh-Abadeh, Daniel Kuhn, Peyman Mohajerin Esfahani

Abstract: We introduce a distributionally robust minimum mean square error estimation model with a Wasserstein ambiguity set to recover an unknown signal from a noisy observation. The proposed model can be viewed as a zero-sum game between a statistician choosing an estimator -- that is, a measurable function of the observation -- and a fictitious adversary choosing a prior -- that is, a pair of signal and noise distributions ranging over independent Wasserstein balls -- with the goal to minimize and maximize the expected squared estimation error, respectively. We show that if the Wasserstein balls are centered at normal distributions, then the zero-sum game admits a Nash equilibrium, where the players' optimal strategies are given by an {\em affine} estimator and a {\em normal} prior, respectively. We further prove that this Nash equilibrium can be computed by solving a tractable convex program. Finally, we develop a Frank-Wolfe algorithm that can solve this convex program orders of magnitude faster than state-of-the-art general purpose solvers. We show that this algorithm enjoys a linear convergence rate and that its direction-finding subproblems can be solved in quasi-closed form.

Submitted 27 January, 2021; v1 submitted 8 November, 2019; originally announced November 2019.

arXiv:1910.10583 [pdf, other]

Optimistic Distributionally Robust Optimization for Nonparametric Likelihood Approximation

Authors: Viet Anh Nguyen, Soroosh Shafieezadeh-Abadeh, Man-Chung Yue, Daniel Kuhn, Wolfram Wiesemann

Abstract: The likelihood function is a fundamental component in Bayesian statistics. However, evaluating the likelihood of an observation is computationally intractable in many applications. In this paper, we propose a non-parametric approximation of the likelihood that identifies a probability measure which lies in the neighborhood of the nominal measure and that maximizes the probability of observing the given sample point. We show that when the neighborhood is constructed by the Kullback-Leibler divergence, by moment conditions or by the Wasserstein distance, then our \textit{optimistic likelihood} can be determined through the solution of a convex optimization problem, and it admits an analytical expression in particular cases. We also show that the posterior inference problem with our optimistic likelihood approximation enjoys strong theoretical performance guarantees, and it performs competitively in a probabilistic classification task.

Submitted 23 October, 2019; originally announced October 2019.

arXiv:1910.07817 [pdf, other]

Calculating Optimistic Likelihoods Using (Geodesically) Convex Optimization

Authors: Viet Anh Nguyen, Soroosh Shafieezadeh-Abadeh, Man-Chung Yue, Daniel Kuhn, Wolfram Wiesemann

Abstract: A fundamental problem arising in many areas of machine learning is the evaluation of the likelihood of a given observation under different nominal distributions. Frequently, these nominal distributions are themselves estimated from data, which makes them susceptible to estimation errors. We thus propose to replace each nominal distribution with an ambiguity set containing all distributions in its vicinity and to evaluate an \emph{optimistic likelihood}, that is, the maximum of the likelihood over all distributions in the ambiguity set. When the proximity of distributions is quantified by the Fisher-Rao distance or the Kullback-Leibler divergence, the emerging optimistic likelihoods can be computed efficiently using either geodesic or standard convex optimization techniques. We showcase the advantages of working with optimistic likelihoods on a classification problem using synthetic as well as empirical data.

Submitted 17 October, 2019; originally announced October 2019.

arXiv:1908.08729 [pdf, other]

Wasserstein Distributionally Robust Optimization: Theory and Applications in Machine Learning

Authors: Daniel Kuhn, Peyman Mohajerin Esfahani, Viet Anh Nguyen, Soroosh Shafieezadeh-Abadeh

Abstract: Many decision problems in science, engineering and economics are affected by uncertain parameters whose distribution is only indirectly observable through samples. The goal of data-driven decision-making is to learn a decision from finitely many training samples that will perform well on unseen test samples. This learning task is difficult even if all training and test samples are drawn from the same distribution---especially if the dimension of the uncertainty is large relative to the training sample size. Wasserstein distributionally robust optimization seeks data-driven decisions that perform well under the most adverse distribution within a certain Wasserstein distance from a nominal distribution constructed from the training samples. In this tutorial we will argue that this approach has many conceptual and computational benefits. Most prominently, the optimal decisions can often be computed by solving tractable convex optimization problems, and they enjoy rigorous out-of-sample and asymptotic consistency guarantees. We will also show that Wasserstein distributionally robust optimization has interesting ramifications for statistical learning and motivates new approaches for fundamental learning tasks such as classification, regression, maximum likelihood estimation or minimum mean square error estimation, among others.

Submitted 23 August, 2019; originally announced August 2019.

Comments: 36 pages

arXiv:1903.05052 [pdf, ps, other]

Dirac's theorem for random regular graphs

Authors: Padraig Condon, Alberto Espuny Díaz, António Girão, Daniela Kühn, Deryk Osthus

Abstract: We prove a `resilience' version of Dirac's theorem in the setting of random regular graphs. More precisely, we show that, whenever $d$ is sufficiently large compared to $\varepsilon>0$, a.a.s. the following holds: let $G'$ be any subgraph of the random $n$-vertex $d$-regular graph $G_{n,d}$ with minimum degree at least $(1/2+\varepsilon)d$. Then $G'$ is Hamiltonian. This proves a conjecture of Ben-Shimon, Krivelevich and Sudakov. Our result is best possible: firstly, the condition that $d$ is large cannot be omitted, and secondly, the minimum degree bound cannot be improved.

Submitted 23 June, 2020; v1 submitted 12 March, 2019; originally announced March 2019.

Comments: Final accepted version, to appear in Combinatorics, Probability & Computing

arXiv:1903.04262 [pdf, ps, other]

Decompositions into isomorphic rainbow spanning trees

Authors: Stefan Glock, Daniela Kühn, Richard Montgomery, Deryk Osthus

Abstract: A subgraph of an edge-coloured graph is called rainbow if all its edges have distinct colours. Our main result implies that, given any optimal colouring of a sufficiently large complete graph $K_{2n}$, there exists a decomposition of $K_{2n}$ into isomorphic rainbow spanning trees. This settles conjectures of Brualdi--Hollingsworth (from 1996) and Constantine (from 2002) for large graphs.

Submitted 6 March, 2020; v1 submitted 11 March, 2019; originally announced March 2019.

Comments: Version accepted to appear in JCTB

arXiv:1810.12433 [pdf, ps, other]

Resilient degree sequences with respect to Hamilton cycles and matchings in random graphs

Authors: Padraig Condon, Alberto Espuny Díaz, Jaehoon Kim, Daniela Kühn, Deryk Osthus

Abstract: Pósa's theorem states that any graph $G$ whose degree sequence $d_1 \le \ldots \le d_n$ satisfies $d_i \ge i+1$ for all $i < n/2$ has a Hamilton cycle. This degree condition is best possible. We show that a similar result holds for suitable subgraphs $G$ of random graphs, i.e. we prove a `resilience version' of Pósa's theorem: if $pn \ge C \log n$ and the $i$-th vertex degree (ordered increasingly) of $G \subseteq G_{n,p}$ is at least $(i+o(n))p$ for all $i<n/2$, then $G$ has a Hamilton cycle. This is essentially best possible and strengthens a resilience version of Dirac's theorem obtained by Lee and Sudakov. Chvátal's theorem generalises Pósa's theorem and characterises all degree sequences which ensure the existence of a Hamilton cycle. We show that a natural guess for a resilience version of Chvátal's theorem fails to be true. We formulate a conjecture which would repair this guess, and show that the corresponding degree conditions ensure the existence of a perfect matching in any subgraph of $G_{n,p}$ which satisfies these conditions. This provides an asymptotic characterisation of all degree sequences which resiliently guarantee the existence of a perfect matching.

Submitted 2 December, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

Comments: To appear in the Electronic Journal of Combinatorics. This version corrects a couple of typos

arXiv:1809.08830 [pdf, other]

Wasserstein Distributionally Robust Kalman Filtering

Authors: Soroosh Shafieezadeh-Abadeh, Viet Anh Nguyen, Daniel Kuhn, Peyman Mohajerin Esfahani

Abstract: We study a distributionally robust mean square error estimation problem over a nonconvex Wasserstein ambiguity set containing only normal distributions. We show that the optimal estimator and the least favorable distribution form a Nash equilibrium. Despite the non-convex nature of the ambiguity set, we prove that the estimation problem is equivalent to a tractable convex program. We further devise a Frank-Wolfe algorithm for this convex program whose direction-searching subproblem can be solved in a quasi-closed form. Using these ingredients, we introduce a distributionally robust Kalman filter that hedges against model risk.

Submitted 1 October, 2018; v1 submitted 24 September, 2018; originally announced September 2018.

arXiv:1809.00210 [pdf, other]

Data-Driven Chance Constrained Programs over Wasserstein Balls

Authors: Zhi Chen, Daniel Kuhn, Wolfram Wiesemann

Abstract: We provide an exact deterministic reformulation for data-driven chance constrained programs over Wasserstein balls. For individual chance constraints as well as joint chance constraints with right-hand side uncertainty, our reformulation amounts to a mixed-integer conic program. In the special case of a Wasserstein ball with the $1$-norm or the $\infty$-norm, the cone is the nonnegative orthant, and the chance constrained program can be reformulated as a mixed-integer linear program. Our reformulation compares favourably to several state-of-the-art data-driven optimization schemes in our numerical experiments.

Submitted 31 May, 2022; v1 submitted 1 September, 2018; originally announced September 2018.

Comments: 25 pages, 9 figures

Showing 1–50 of 129 results for author: Kuhn, D