-
Mitigating the Impact of Labeling Errors on Training via Rockafellian Relaxation
Authors:
Louis L. Chen,
Bobbie Chern,
Eric Eckstrand,
Amogh Mahapatra,
Johannes O. Royset
Abstract:
Labeling errors in datasets are common, if not systematic, in practice. They naturally arise in a variety of contexts, for example human labeling, noisy labeling, and weak labeling (e.g., in image classification). This presents a persistent and pervasive stress on machine learning practice. In particular, neural network (NN) architectures can withstand minor amounts of dataset imperfection with traditional countermeasures such as regularization, data augmentation, and batch normalization. However, major dataset imperfections often prove insurmountable. We propose and study the implementation of Rockafellian Relaxation (RR), a new loss reweighting, architecture-independent methodology, for neural network training. Experiments indicate RR can enhance standard neural network methods to achieve robust performance across classification tasks in computer vision and natural language processing (sentiment analysis). We find that RR can mitigate the effects of dataset corruption due to (heavy) labeling error, adversarial perturbation, or both, demonstrating effectiveness across a variety of data domains and machine learning tasks.
Submitted 30 May, 2024;
originally announced May 2024.
-
Rockafellian Relaxation for PDE-Constrained Optimization with Distributional Uncertainty
Authors:
Harbir Antil,
Sean P. Carney,
Hugo Díaz,
Johannes O. Royset
Abstract:
Stochastic optimization problems are generally known to be ill-conditioned to the form of the underlying uncertainty. A framework is introduced for optimal control problems with partial differential equations as constraints that is robust to inaccuracies in the precise form of the problem uncertainty. The framework is based on problem relaxation and involves optimizing a bivariate, "Rockafellian" objective functional that features both a standard control variable and an additional perturbation variable that handles the distributional ambiguity. In the presence of distributional corruption, the Rockafellian objective functionals are shown in the appropriate settings to $\Gamma$-converge to uncorrupted objective functionals in the limit of vanishing corruption. Numerical examples illustrate the framework's utility for outlier detection and removal and for variance reduction.
Submitted 30 April, 2024;
originally announced May 2024.
-
Approximations of Rockafellians, Lagrangians, and Dual Functions
Authors:
Julio Deride,
Johannes O. Royset
Abstract:
Solutions of an optimization problem are sensitive to changes caused by approximations or parametric perturbations, especially in the nonconvex setting. This paper investigates the ability of substitute problems, constructed from Rockafellian functions, to provide robustness against such perturbations. Unlike classical stability analysis focused on local changes around (local) minimizers, we employ epi-convergence to examine whether the approximating problems suitably approach the actual one globally. We show that under natural assumptions the substitute problems can be well-behaved in the sense of epi-convergence even though the actual one is not. We further quantify the rates of convergence that often lead to Lipschitz-kind stability properties for the substitute problems.
Submitted 28 April, 2024;
originally announced April 2024.
-
Risk-Adaptive Local Decision Rules
Authors:
Johannes O. Royset,
Miguel A. Lejeune
Abstract:
For parameterized mixed-binary optimization problems, we construct local decision rules that prescribe near-optimal courses of action across a set of parameter values. The decision rules stem from solving risk-adaptive training problems over classes of continuous, possibly nonlinear mappings. In asymptotic and nonasymptotic analysis, we establish that the decision rules prescribe near-optimal decisions locally for the actual problems, without relying on linearity, convexity, or smoothness. The development also accounts for practically important aspects such as inexact function evaluations, solution tolerances in training problems, regularization, and reformulations to solver-friendly models. The decision rules also furnish a means to carry out sensitivity and stability analysis for broad classes of parameterized optimization problems. We develop a decomposition algorithm for solving the resulting training problems and demonstrate its ability to generate quality decision rules on a nonlinear binary optimization model from search theory.
Submitted 22 April, 2024; v1 submitted 15 October, 2023;
originally announced October 2023.
-
On Memorization and Privacy Risks of Sharpness Aware Minimization
Authors:
Young In Kim,
Pratiksha Agrawal,
Johannes O. Royset,
Rajiv Khanna
Abstract:
In many recent works, there is an increased focus on designing algorithms that seek flatter optima for neural network loss optimization, as there is empirical evidence that it leads to better generalization performance on many datasets. In this work, we dissect these performance gains through the lens of data memorization in overparameterized models. We define a new metric that helps us identify on which data points algorithms seeking flatter optima do better than vanilla SGD. We find that the generalization gains achieved by Sharpness Aware Minimization (SAM) are particularly pronounced for atypical data points, which necessitate memorization. This insight helps us unearth higher privacy risks associated with SAM, which we verify through exhaustive empirical evaluations. Finally, we propose mitigation strategies to achieve a more desirable accuracy vs privacy tradeoff.
Submitted 3 January, 2024; v1 submitted 30 September, 2023;
originally announced October 2023.
-
Multi-Agent Search for a Moving and Camouflaging Target
Authors:
Miguel Lejeune,
Johannes O. Royset,
Wenbo Ma
Abstract:
In multi-agent search planning for a randomly moving and camouflaging target, we examine heterogeneous searchers that differ in terms of their endurance level, travel speed, and detection ability. This leads to a convex mixed-integer nonlinear program, which we reformulate using three linearization techniques. We develop preprocessing steps, outer approximations via lazy constraints, and bundle-based cutting plane methods to address large-scale instances. Further specializations emerge when the target moves according to a Markov chain. We carry out an extensive numerical study to show the computational efficiency of our methods and to derive insights regarding which approach should be favored for which type of problem instance.
Submitted 1 November, 2023; v1 submitted 5 September, 2023;
originally announced September 2023.
-
A variational approach to a cumulative distribution function estimation problem under stochastic ambiguity
Authors:
Julio Deride,
Johannes O. Royset,
Fernanda Urrea
Abstract:
We propose a method for finding a cumulative distribution function (cdf) that minimizes the (regularized) distance to a given cdf, while belonging to an ambiguity set constructed relative to another cdf and, possibly, incorporating soft information. Our method embeds the family of cdfs onto the space of upper semicontinuous functions endowed with the hypo-distance. In this setting, we present an approximation scheme based on epi-splines, defined as piecewise polynomial functions, and use bounds for estimating the hypo-distance. Under appropriate hypotheses, we guarantee that the cluster points corresponding to the sequence of minimizers of the resulting approximating problems are solutions to a limiting problem. In addition, we describe a large class of functions that satisfy these hypotheses. The approximating method produces a linear-programming-based approximation scheme, enabling us to develop an algorithm from off-the-shelf solvers. The convergence of our proposed approximation is illustrated by numerical examples for the bivariate case, one of which entails a Lipschitz condition.
Submitted 31 August, 2023;
originally announced September 2023.
-
Risk-Adaptive Approaches to Stochastic Optimization: A Survey
Authors:
Johannes O. Royset
Abstract:
Uncertainty is prevalent in engineering design, data-driven problems, and decision making broadly. Due to inherent risk-averseness and ambiguity about assumptions, it is common to address uncertainty by formulating and solving conservative optimization models expressed using measures of risk and related concepts. We survey the rapid development of risk measures over the last quarter century. From their beginning in financial engineering, we recount the spread to nearly all areas of engineering and applied mathematics. Solidly rooted in convex analysis, risk measures furnish a general framework for handling uncertainty with significant computational and theoretical advantages. We describe the key facts, list several concrete algorithms, and provide an extensive list of references for further reading. The survey recalls connections with utility theory and distributionally robust optimization, points to emerging application areas such as fair machine learning, and defines measures of reliability.
Submitted 3 April, 2024; v1 submitted 1 December, 2022;
originally announced December 2022.
-
Optimizing Surveillance Satellites for the Synthetic Theater Operations Research Model
Authors:
Steven M. Warner,
Johannes O. Royset
Abstract:
The Synthetic Theater Operations Research Model (STORM) simulates theater-level conflict and requires inputs about utilization of surveillance satellites to search large geographical areas. We develop a mixed-integer linear optimization model that prescribes plans for how satellites and their sensors should be directed to best search an area of operations. It also specifies the resolution levels employed by the sensors to ensure a suitable fidelity of the resulting images. We solve large-scale instances of the model involving up to 22 million variables and 11 million constraints in scenarios derived from STORM. On average, the model yields 55% improvement in search coverage relative to an existing heuristic algorithm in STORM.
Submitted 20 October, 2022;
originally announced October 2022.
-
S-BORM: Reliability-based optimization of general systems using buffered optimization and reliability method
Authors:
Ji-Eun Byun,
Welington de Oliveira,
Johannes O. Royset
Abstract:
Reliability-based optimization (RBO) is crucial for identifying optimal risk-informed decisions for designing and operating engineering systems. However, its computation remains challenging as it requires a concurrent task of optimization and reliability analysis. Moreover, computation becomes even more complicated when considering performance of a general system, whose failure event is represented as a link-set of cut-sets. This is because even when component events have smooth and convex limit-state functions, the system limit-state function has neither property, except in trivial cases. To address the challenge, this study develops an efficient algorithm to solve RBO problems of general system events. We employ the buffered optimization and reliability method (BORM), which utilizes, instead of the conventional failure probability definition, the buffered failure probability. The proposed algorithm solves a sequence of difference-of-convex RBO models iteratively by employing a proximal bundle method. For demonstration, we design three numerical examples with increasing complexity that include up to 108 cut-sets, which are solved by the proposed algorithm within a minute with high accuracy. We also demonstrate its robustness by performing extensive parametric studies.
Submitted 7 October, 2022; v1 submitted 6 September, 2022;
originally announced September 2022.
-
On Robustness in Nonconvex Optimization with Application to Defense Planning
Authors:
Johannes O. Royset
Abstract:
In the context of structured nonconvex optimization, we estimate the increase in minimum value for a decision that is robust to parameter perturbations as compared to the value of a nominal problem. The estimates rely on detailed expressions for subgradients and local Lipschitz moduli of min-value functions in nonconvex robust optimization and require only the solution of the nominal problem. The theoretical results are illustrated by examples from military operations research involving mixed-integer optimization models. Across 54 cases examined, the median error in estimating the increase in minimum value is 12%. Therefore, the derived expressions for subgradients and local Lipschitz moduli may accurately inform analysts about the possibility of obtaining cost-effective, parameter-robust decisions in nonconvex optimization.
Submitted 21 November, 2022; v1 submitted 20 August, 2022;
originally announced August 2022.
-
Epi-Convergence of Expectation Functions under Varying Measures and Integrands
Authors:
Eugene A. Feinberg,
Pavlo O. Kasyanov,
Johannes O. Royset
Abstract:
For expectation functions on metric spaces, we provide sufficient conditions for epi-convergence under varying probability measures and integrands, and examine applications in the area of sieve estimators, mollifier smoothing, PDE-constrained optimization, and stochastic optimization with expectation constraints. As a stepping stone to epi-convergence of independent interest, we develop parametric Fatou's lemmas under mild integrability assumptions. In the setting of Suslin metric spaces, the assumptions are expressed in terms of Pasch-Hausdorff envelopes. For general metric spaces, the assumptions shift to semicontinuity of integrands also on the sample space, which then is assumed to be a metric space.
Submitted 7 August, 2022;
originally announced August 2022.
-
Rockafellian Relaxation and Stochastic Optimization under Perturbations
Authors:
Johannes O. Royset,
Louis L. Chen,
Eric Eckstrand
Abstract:
In practice, optimization models are often prone to unavoidable inaccuracies due to dubious assumptions and corrupted data. Traditionally, this placed special emphasis on risk-based and robust formulations, and their focus on "conservative" decisions. We develop, in contrast, an "optimistic" framework based on Rockafellian relaxations in which optimization is conducted not only over the original decision space but also jointly with a choice of model perturbation. The framework enables us to address challenging problems with ambiguous probability distributions from the areas of two-stage stochastic optimization without relatively complete recourse, probability functions lacking continuity properties, expectation constraints, and outlier analysis. We are also able to circumvent the fundamental difficulty in stochastic optimization that convergence of distributions fails to guarantee convergence of expectations. The framework centers on the novel concepts of exact and limit-exact Rockafellians, with interpretations of "negative" regularization emerging in certain settings. We illustrate the role of Phi-divergence, examine rates of convergence under changing distributions, and explore extensions to first-order optimality conditions. The main development is free of assumptions about convexity, smoothness, and even continuity of objective functions. Numerical results in the setting of computer vision and text analytics with label noise illustrate the framework.
Submitted 20 November, 2023; v1 submitted 10 April, 2022;
originally announced April 2022.
-
Consistent Approximations in Composite Optimization
Authors:
Johannes O. Royset
Abstract:
Approximations of optimization problems arise in computational procedures and sensitivity analysis. The resulting effect on solutions can be significant, with even small approximations of components of a problem translating into large errors in the solutions. We specify conditions under which approximations are well behaved in the sense of minimizers, stationary points, and level-sets, and this leads to a framework of consistent approximations. The framework is developed for a broad class of composite problems, which are neither convex nor smooth. We demonstrate the framework using examples from stochastic optimization, neural-network-based machine learning, distributionally robust optimization, penalty and augmented Lagrangian methods, interior-point methods, homotopy methods, smoothing methods, extended nonlinear programming, difference-of-convex programming, and multi-objective optimization. An enhanced proximal method illustrates the algorithmic possibilities. A quantitative analysis supplements the development by furnishing rates of convergence.
Submitted 8 August, 2022; v1 submitted 13 January, 2022;
originally announced January 2022.
-
Performance Bounds for PDE-Constrained Optimization under Uncertainty
Authors:
Peng Chen,
Johannes O. Royset
Abstract:
Computational approaches to PDE-constrained optimization under uncertainty may involve finite-dimensional approximations of control and state spaces, sample average approximations of measures of risk and reliability, smooth approximations of nonsmooth functions, penalty approximations of constraints as well as many other kinds of inaccuracies. In this paper, we analyze the performance of controls obtained by an approximation-based algorithm and in the process develop estimates of optimality gaps for general optimization problems defined on metric spaces. Under mild assumptions, we establish that limiting controls have arbitrarily small optimality gaps provided that the inaccuracies in the various approximations vanish. We carry out the analysis for a broad class of problems with multiple expectation, risk, and reliability functions involving PDE solutions and appearing in objective as well as constraint expressions. In particular, we address problems with buffered failure probability constraints approximated via an augmented Lagrangian. We demonstrate the framework on an elliptic PDE with a random coefficient field and a distributed control function.
Submitted 31 August, 2022; v1 submitted 19 October, 2021;
originally announced October 2021.
-
Gradients and Subgradients of Buffered Failure Probability
Authors:
Johannes O. Royset,
Ji-Eun Byun
Abstract:
Gradients and subgradients are central to optimization and sensitivity analysis of buffered failure probabilities. We furnish a characterization of subgradients based on subdifferential calculus in the case of finite probability distributions and, under additional assumptions, also a gradient expression for general distributions. Several examples illustrate the application of the results, especially in the context of optimality conditions.
Submitted 22 October, 2021; v1 submitted 11 September, 2021;
originally announced September 2021.
-
Data-driven optimization of reliability using buffered failure probability
Authors:
Ji-Eun Byun,
Johannes O. Royset
Abstract:
Design and operation of complex engineering systems rely on reliability optimization. Such optimization requires us to account for uncertainties expressed in terms of complicated, high-dimensional probability distributions, for which only samples or data might be available. However, using data or samples often degrades the computational efficiency, particularly as the conventional failure probability is estimated using the indicator function whose gradient is not defined at zero. To address this issue, by leveraging the buffered failure probability, the paper develops the buffered optimization and reliability method (BORM) for efficient, data-driven optimization of reliability. The proposed formulations, algorithms, and strategies greatly improve the computational efficiency of the optimization and thereby address the needs of high-dimensional and nonlinear problems. In addition, an analytical formula is developed to estimate the reliability sensitivity, a subject fraught with difficulty when using the conventional failure probability. The buffered failure probability is thoroughly investigated in the context of many different distributions, leading to a novel measure of tail-heaviness called the buffered tail index. The efficiency and accuracy of the proposed optimization methodology are demonstrated by three numerical examples, which underline the unique advantages of the buffered failure probability for data-driven reliability analysis.
Submitted 21 September, 2021; v1 submitted 22 July, 2021;
originally announced July 2021.
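Note on the entry above: the contrast between the conventional and the buffered failure probability can be illustrated on a small sample. The following is only a minimal sketch, not the BORM algorithm from the paper. It assumes failure corresponds to a positive limit-state value and uses the well-known minimization formula for the buffered failure probability, min over a >= 0 of E[max(0, a*g + 1)]. The function names and the sample values are hypothetical.

```python
import numpy as np

def failure_probability(g):
    """Conventional estimate: fraction of samples with g > 0 (indicator-based)."""
    g = np.asarray(g, dtype=float)
    return float(np.mean(g > 0.0))

def buffered_failure_probability(g):
    """Empirical buffered failure probability at threshold 0.

    Evaluates min_{a >= 0} mean(max(0, a*g + 1)). For a finite sample the
    objective is convex and piecewise linear in a, so the minimum is attained
    at a = 0 or at a kink a = -1/g_i belonging to a negative sample g_i.
    """
    g = np.asarray(g, dtype=float)
    negative = g[g < 0.0]
    candidates = np.concatenate(([0.0], -1.0 / negative))
    # Row k holds max(0, a_k * g_i + 1) for every sample i.
    values = np.maximum(0.0, np.outer(candidates, g) + 1.0).mean(axis=1)
    return float(values.min())

# Hypothetical limit-state samples: one failure out of five.
samples = np.array([-4.0, -3.0, -2.0, -1.0, 1.0])
pf = failure_probability(samples)
bpf = buffered_failure_probability(samples)
print(pf, bpf)
```

In this sample the buffered value is larger than the conventional one, which reflects that it also accounts for how far the failing sample lies beyond the threshold. Unlike the indicator-based estimate, the objective inside the minimization is convex in a, which is what makes it amenable to gradient-based and subgradient-based optimization.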
-
Good and Bad Optimization Models: Insights from Rockafellians
Authors:
Johannes O. Royset
Abstract:
A basic requirement for a mathematical model is often that its solution (output) shouldn't change much if the model's parameters (input) are perturbed. This is important because the exact values of parameters may not be known and one would like to avoid being misled by an output obtained using incorrect values. Thus, it's rarely enough to address an application by formulating a model, solving the resulting optimization problem and presenting the solution as the answer. One would need to confirm that the model is suitable, i.e., "good," and this can, at least in part, be achieved by considering a family of optimization problems constructed by perturbing parameters of concern. The resulting sensitivity analysis uncovers troubling situations with unstable solutions, which we refer to as "bad" models, and indicates better model formulations. Embedding an actual problem of interest within a family of problems is also a primary path to optimality conditions as well as computationally attractive, alternative problems, which under ideal circumstances, and when properly tuned, may even furnish the minimum value of the actual problem. The tuning of these alternative problems turns out to be intimately tied to finding multipliers in optimality conditions and thus emerges as a main component of several optimization algorithms. In fact, the tuning amounts to solving certain dual optimization problems. In this tutorial, we'll discuss the opportunities and insights afforded by this broad perspective.
Submitted 5 August, 2021; v1 submitted 13 May, 2021;
originally announced May 2021.
-
Certifiable Risk-Based Engineering Design Optimization
Authors:
Anirban Chaudhuri,
Boris Kramer,
Matthew Norton,
Johannes O. Royset,
Karen Willcox
Abstract:
Reliable, risk-averse design of complex engineering systems with optimized performance requires dealing with uncertainties. A conventional approach is to add safety margins to a design that was obtained from deterministic optimization. Safer engineering designs require appropriate cost and constraint function definitions that capture the risk associated with unwanted system behavior in the presence of uncertainties. The paper proposes two notions of certifiability. The first is based on accounting for the magnitude of failure to ensure data-informed conservativeness. The second is the ability to provide optimization convergence guarantees by preserving convexity. Satisfying these notions leads to certifiable risk-based design optimization (CRiBDO). In the context of CRiBDO, risk measures based on superquantile (a.k.a. conditional value-at-risk) and buffered probability of failure are analyzed. CRiBDO is contrasted with reliability-based design optimization (RBDO), where uncertainties are accounted for via the probability of failure, through a structural and a thermal design problem. A reformulation of the short column structural design problem leading to a convex CRiBDO problem is presented. The CRiBDO formulations capture more information about the problem to assign the appropriate conservativeness, exhibit superior optimization convergence by preserving properties of underlying functions, and alleviate the adverse effects of choosing hard failure thresholds required in RBDO.
Submitted 13 July, 2021; v1 submitted 13 January, 2021;
originally announced January 2021.
-
Set-Convergence and Its Application: A Tutorial
Authors:
Johannes O. Royset
Abstract:
Optimization problems, generalized equations, and the multitude of other variational problems invariably lead to the analysis of sets and set-valued map**s as well as their approximations. We review the central concept of set-convergence and explain its role in defining a notion of proximity between sets, especially for epigraphs of functions and graphs of set-valued map**s. The development leads to an approximation theory for optimization problems and generalized equations with profound consequences for the construction of algorithms. We also introduce the role of set-convergence in variational geometry and subdifferentiability with applications to optimality conditions. Examples illustrate the importance of set-convergence in stability analysis, error analysis, construction of algorithms, statistical estimation, and probability theory.
Submitted 22 February, 2020;
originally announced February 2020.
-
Diametrical Risk Minimization: Theory and Computations
Authors:
Matthew Norton,
Johannes O. Royset
Abstract:
The theoretical and empirical performance of Empirical Risk Minimization (ERM) often suffers when loss functions are poorly behaved with large Lipschitz moduli and spurious sharp minimizers. We propose and analyze a counterpart to ERM called Diametrical Risk Minimization (DRM), which accounts for worst-case empirical risks within neighborhoods in parameter space. DRM has generalization bounds that are independent of Lipschitz moduli for convex as well as nonconvex problems and it can be implemented using a practical algorithm based on stochastic gradient descent. Numerical results illustrate the ability of DRM to find quality solutions with low generalization error in sharp empirical risk landscapes from benchmark neural network classification problems with corrupted labels.
Submitted 7 July, 2021; v1 submitted 23 October, 2019;
originally announced October 2019.
-
Stability and Error Analysis for Optimization and Generalized Equations
Authors:
Johannes O. Royset
Abstract:
Stability and error analysis remain challenging for problems that lack regularity properties near solutions, are subject to large perturbations, and might be infinite dimensional. We consider nonconvex optimization and generalized equations defined on metric spaces and develop bounds on solution errors using the truncated Hausdorff distance applied to graphs and epigraphs of the underlying set-valued mappings and functions. In the process, we extend the calculus of such distances to cover compositions and other constructions that arise in nonconvex problems. The results are applied to constrained problems with feasible sets that might have empty interiors, solution of KKT systems, and optimality conditions for difference-of-convex functions and composite functions.
Submitted 22 February, 2020; v1 submitted 20 March, 2019;
originally announced March 2019.
-
Approximations of Semicontinuous Functions with Applications to Stochastic Optimization and Statistical Estimation
Authors:
Johannes O. Royset
Abstract:
Upper semicontinuous (usc) functions arise in the analysis of maximization problems, distributionally robust optimization, and function identification, which includes many problems of nonparametric statistics. We establish that every usc function is the limit of a hypo-converging sequence of piecewise affine functions of the difference-of-max type and illustrate resulting algorithmic possibilities in the context of approximate solution of infinite-dimensional optimization problems. In an effort to quantify the ease with which classes of usc functions can be approximated by finite collections, we provide upper and lower bounds on covering numbers for bounded sets of usc functions under the Attouch-Wets distance. The result is applied in the context of stochastic optimization problems defined over spaces of usc functions. We establish confidence regions for optimal solutions based on sample average approximations and examine the accompanying rates of convergence. Examples from nonparametric statistics illustrate the results.
Submitted 5 July, 2019; v1 submitted 20 September, 2017;
originally announced September 2017.
-
Variational Analysis of Constrained M-Estimators
Authors:
Johannes O. Royset,
Roger J-B Wets
Abstract:
We propose a unified framework for establishing existence of nonparametric M-estimators, computing the corresponding estimates, and proving their strong consistency when the class of functions is exceptionally rich. In particular, the framework addresses situations where the class of functions is complex involving information and assumptions about shape, pointwise bounds, location of modes, height at modes, location of level-sets, values of moments, size of subgradients, continuity, distance to a "prior" function, multivariate total positivity, and any combination of the above. The class might be engineered to perform well in a specific setting even in the presence of little data. The framework views the class of functions as a subset of a particular metric space of upper semicontinuous functions under the Attouch-Wets distance. In addition to allowing a systematic treatment of numerous M-estimators, the framework yields consistency of plug-in estimators of modes of densities, maximizers of regression functions, level-sets of classifiers, and related quantities, and also enables computation by means of approximating parametric classes. We establish consistency through a one-sided law of large numbers, here extended to sieves, that relaxes assumptions of uniform laws, while ensuring global approximations even under model misspecification.
Submitted 9 September, 2019; v1 submitted 26 February, 2017;
originally announced February 2017.