-
Importance Sampling for Minimization of Tail Risks: A Tutorial
Authors:
Anand Deo,
Karthyek Murthy
Abstract:
This paper provides an introductory overview of how one may employ importance sampling effectively as a tool for solving stochastic optimization formulations incorporating tail risk measures such as Conditional Value-at-Risk. Approximating the tail risk measure by its sample average approximation, while appealing due to its simplicity and universality in use, requires a large number of samples to be able to arrive at risk-minimizing decisions with high confidence. This is primarily due to the rarity with which the relevant tail events get observed in the samples. In simulation, Importance Sampling is among the most prominent methods for substantially reducing the sample requirement while estimating probabilities of rare events. Can importance sampling be used for optimization as well? If so, what are the ingredients required for making importance sampling an effective tool for optimization formulations involving rare events? This tutorial aims to provide an introductory overview of the two key ingredients in this regard, namely, (i) how one may arrive at an importance sampling change of measure prescription at every decision, and (ii) the prominent techniques available for integrating such a prescription within a solution paradigm for stochastic optimization formulations.
Submitted 10 July, 2023;
originally announced July 2023.
-
A Nonparametric Approach with Marginals for Modeling Consumer Choice
Authors:
Yanqiu Ruan,
Xiaobo Li,
Karthyek Murthy,
Karthik Natarajan
Abstract:
Given data on the choices made by consumers for different offer sets, a key challenge is to develop parsimonious models that describe and predict consumer choice behavior while being amenable to prescriptive tasks such as pricing and assortment optimization. The marginal distribution model (MDM) is one such model, requiring only the specification of marginal distributions of the random utilities. This paper aims to establish necessary and sufficient conditions for given choice data to be consistent with the MDM hypothesis, inspired by the utility of similar characterizations for the random utility model (RUM). This endeavor leads to an exact characterization of the set of choice probabilities that the MDM can represent. Verifying the consistency of choice data with this characterization is equivalent to solving a polynomial-sized linear program. Since the analogous verification task for RUM is computationally intractable and neither of these models subsumes the other, MDM is helpful in striking a balance between tractability and representational power. The characterization lends itself to use with robust optimization for making data-driven sales and revenue predictions for new unseen assortments. When the choice data lacks consistency with the MDM hypothesis, finding the best-fitting MDM choice probabilities reduces to solving a mixed integer convex program. The results extend naturally to the case where the alternatives can be grouped based on the similarity of the marginal distributions of the utilities. Numerical experiments show that MDM provides better representational power and prediction accuracy than multinomial logit and significantly better computational performance than RUM.
Submitted 24 July, 2023; v1 submitted 12 August, 2022;
originally announced August 2022.
-
Combining Retrospective Approximation with Importance Sampling for Optimising Conditional Value at Risk
Authors:
Anand Deo,
Karthyek Murthy,
Tirtho Sarker
Abstract:
This paper investigates the use of the retrospective approximation solution paradigm in solving risk-averse optimization problems effectively via importance sampling (IS). While IS serves as a prominent means for tackling the large sample requirements in estimating tail risk measures such as Conditional Value at Risk (CVaR), its use in optimization problems driven by CVaR is complicated by the need to tailor the IS change of measure differently to different optimization iterates and the circularity which arises as a consequence. The proposed algorithm overcomes these challenges by employing a univariate IS transformation offering uniform variance reduction in a retrospective approximation procedure well-suited for tuning the IS parameter choice. The resulting simulation-based approximation scheme enjoys both the computational efficiency bestowed by retrospective approximation and the logarithmically efficient variance reduction offered by importance sampling.
Submitted 26 June, 2022;
originally announced June 2022.
-
Statistical Analysis of Wasserstein Distributionally Robust Estimators
Authors:
Jose Blanchet,
Karthyek Murthy,
Viet Anh Nguyen
Abstract:
We consider statistical methods which invoke a min-max distributionally robust formulation to extract good out-of-sample performance in data-driven optimization and learning problems. Acknowledging the distributional uncertainty in learning from limited samples, the min-max formulations introduce an adversarial inner player to explore unseen covariate data. The resulting Distributionally Robust Optimization (DRO) formulations, which include Wasserstein DRO formulations (our main focus), are specified using optimal transportation phenomena. Upon describing how these infinite-dimensional min-max problems can be approached via a finite-dimensional dual reformulation, the tutorial moves into its main component, namely, explaining a generic recipe for optimally selecting the size of the adversary's budget. This is achieved by studying the limit behavior of an optimal transport projection formulation arising from an inquiry on the smallest confidence region that includes the unknown population risk minimizer. Incidentally, this systematic prescription coincides with those in specific examples in high-dimensional statistics and results in error bounds that are free from the curse of dimensions. Equipped with this prescription, we present a central limit theorem for the DRO estimator and provide a recipe for constructing compatible confidence regions that are useful for uncertainty quantification. The rest of the tutorial is devoted to insights into the nature of the optimizers selected by the min-max formulations and additional applications of optimal transport projections.
Submitted 4 August, 2021;
originally announced August 2021.
-
Efficient Black-Box Importance Sampling for VaR and CVaR Estimation
Authors:
Anand Deo,
Karthyek Murthy
Abstract:
This paper considers Importance Sampling (IS) for the estimation of tail risks of a loss defined in terms of a sophisticated object such as a machine learning feature map or a mixed integer linear optimisation formulation. Assuming only black-box access to the loss and the distribution of the underlying random vector, the paper presents an efficient IS algorithm for estimating the Value at Risk and Conditional Value at Risk. The key challenge in any IS procedure, namely, identifying an appropriate change-of-measure, is automated with a self-structuring IS transformation that learns and replicates the concentration properties of the conditional excess from less rare samples. The resulting estimators enjoy asymptotically optimal variance reduction when viewed in the logarithmic scale. Simulation experiments highlight the efficacy and practicality of the proposed scheme.
Submitted 15 June, 2021;
originally announced June 2021.
-
Testing Group Fairness via Optimal Transport Projections
Authors:
Nian Si,
Karthyek Murthy,
Jose Blanchet,
Viet Anh Nguyen
Abstract:
We present a statistical testing framework to detect if a given machine learning classifier fails to satisfy a wide range of group fairness notions. The proposed test is a flexible, interpretable, and statistically rigorous tool for auditing whether exhibited biases are intrinsic to the algorithm or due to the randomness in the data. The statistical challenges, which may arise from multiple impact criteria that define group fairness and which are discontinuous in the model parameters, are conveniently tackled by projecting the empirical measure onto the set of group-fair probability models using optimal transport. This statistic is efficiently computed using linear programming and its asymptotic distribution is explicitly obtained. The proposed framework can also be used to test composite fairness hypotheses and fairness with multiple sensitive attributes. The optimal transport testing formulation improves interpretability by characterizing the minimal covariate perturbations that eliminate the bias observed in the audit.
Submitted 2 June, 2021;
originally announced June 2021.
-
Achieving Efficiency in Black Box Simulation of Distribution Tails with Self-structuring Importance Samplers
Authors:
Anand Deo,
Karthyek Murthy
Abstract:
This paper presents a novel Importance Sampling (IS) scheme for estimating distribution tails of performance measures modeled with a rich set of tools such as linear programs, integer linear programs, piecewise linear/quadratic objectives, feature maps specified with deep neural networks, etc. The conventional approach of explicitly identifying efficient changes of measure suffers from feasibility and scalability concerns beyond highly stylized models, due to their need to be tailored intricately to the objective and the underlying probability distribution. This bottleneck is overcome in the proposed scheme with an elementary transformation which is capable of implicitly inducing an effective IS distribution in a variety of models by replicating the concentration properties observed in less rare samples. This novel approach is guided by developing a large deviations principle that brings out the phenomenon of self-similarity of optimal IS distributions. The proposed sampler is the first to attain asymptotically optimal variance reduction across a spectrum of multivariate distributions despite being oblivious to the specifics of the underlying model. Its applicability is illustrated with contextual shortest path and portfolio credit risk models informed by neural networks.
Submitted 8 July, 2023; v1 submitted 13 February, 2021;
originally announced February 2021.
-
Optimizing tail risks using an importance sampling based extrapolation for heavy-tailed objectives
Authors:
Anand Deo,
Karthyek Murthy
Abstract:
Motivated by the prominence of Conditional Value-at-Risk (CVaR) as a measure for tail risk in settings affected by uncertainty, we develop a new formula for approximating CVaR based optimization objectives and their gradients from limited samples. A key difficulty that limits the widespread practical use of these optimization formulations is the large amount of data required by the state-of-the-art sample average approximation schemes to approximate the CVaR objective with high fidelity. Unlike the state-of-the-art sample average approximations which require impractically large amounts of data in tail probability regions, the proposed approximation scheme exploits the self-similarity of heavy-tailed distributions to extrapolate data from suitable lower quantiles. The resulting approximations are shown to be statistically consistent and are amenable for optimization by means of conventional gradient descent. The approximation is guided by means of a systematic importance-sampling scheme whose asymptotic variance reduction properties are rigorously examined. Numerical experiments demonstrate the superiority of the proposed approximations and the ease of implementation points to the versatility of settings to which the approximation scheme can be applied.
Submitted 22 August, 2020;
originally announced August 2020.
-
Confidence Regions in Wasserstein Distributionally Robust Estimation
Authors:
Jose Blanchet,
Karthyek Murthy,
Nian Si
Abstract:
Wasserstein distributionally robust optimization estimators are obtained as solutions of min-max problems in which the statistician selects a parameter minimizing the worst-case loss among all probability models within a certain distance (in a Wasserstein sense) from the underlying empirical measure. While motivated by the need to identify optimal model parameters or decision choices that are robust to model misspecification, these distributionally robust estimators recover a wide range of regularized estimators, including square-root lasso and support vector machines, among others, as particular cases. This paper studies the asymptotic normality of these distributionally robust estimators as well as the properties of an optimal (in a suitable sense) confidence region induced by the Wasserstein distributionally robust optimization formulation. In addition, key properties of min-max distributionally robust optimization problems are also studied. For example, we show that distributionally robust estimators regularize the loss based on its derivative, and we derive general sufficient conditions which show the equivalence between the min-max distributionally robust optimization problem and the corresponding max-min formulation.
Submitted 3 March, 2021; v1 submitted 4 June, 2019;
originally announced June 2019.
-
Deep Active Localization
Authors:
Sai Krishna,
Keehong Seo,
Dhaivat Bhatt,
Vincent Mai,
Krishna Murthy,
Liam Paull
Abstract:
Active localization is the problem of generating robot actions that allow it to maximally disambiguate its pose within a reference map. Traditional approaches to this use an information-theoretic criterion for action selection and hand-crafted perceptual models. In this work we propose an end-to-end differentiable method for learning to take informative actions that is trainable entirely in simulation and then transferable to real robot hardware with zero refinement. The system is composed of two modules: a convolutional neural network for perception, and a deep reinforcement learned planning module. We introduce a multi-scale approach to the learned perceptual model since the accuracy needed to perform action selection with reinforcement learning is much less than the accuracy needed for robot control. We demonstrate that the resulting system outperforms using the traditional approach for either perception or planning. We also demonstrate our approach's robustness to different map configurations and other nuisance parameters through the use of domain randomization in training. The code is also compatible with the OpenAI gym framework, as well as the Gazebo simulator.
Submitted 5 March, 2019;
originally announced March 2019.
-
Data-driven Optimal Cost Selection for Distributionally Robust Optimization
Authors:
Jose Blanchet,
Yang Kang,
Fan Zhang,
Karthyek Murthy
Abstract:
Recently, (Blanchet, Kang, and Murthy 2016, and Blanchet and Kang 2017) showed that several machine learning algorithms, such as square-root Lasso, Support Vector Machines, and regularized logistic regression, among many others, can be represented exactly as distributionally robust optimization (DRO) problems. The distributional uncertainty is defined as a neighborhood centered at the empirical distribution. We propose a methodology which learns such a neighborhood in a natural data-driven way. We show rigorously that our framework encompasses adaptive regularization as a particular case. Moreover, we demonstrate empirically that our proposed methodology is able to improve upon a wide range of popular machine learning estimators.
Submitted 29 March, 2019; v1 submitted 19 May, 2017;
originally announced May 2017.
-
Exact and efficient simulation of tail probabilities of heavy-tailed infinite series
Authors:
Henrik Hult,
Sandeep Juneja,
Karthyek Murthy
Abstract:
We develop an efficient simulation algorithm for computing the tail probabilities of the infinite series $S = \sum_{n \geq 1} a_n X_n$ when random variables $X_n$ are heavy-tailed. As $S$ is the sum of infinitely many random variables, any simulation algorithm that stops after simulating only fixed, finitely many random variables is likely to introduce a bias. We overcome this challenge by rewriting the tail probability of interest as a sum of a random number of telescoping terms, and subsequently developing conditional Monte Carlo based low variance simulation estimators for each telescoping term. The resulting algorithm is proved to result in estimators that a) have no bias, and b) require only a fixed, finite number of replications irrespective of how rare the tail probability of interest is. Thus, by combining a traditional variance reduction technique such as conditional Monte Carlo with more recent use of auxiliary randomization to remove bias in a multi-level type representation, we develop an efficient and unbiased simulation algorithm for tail probabilities of $S$. These have many applications including in analysis of financial time-series and stochastic recurrence equations arising in models in actuarial risk and population biology.
Submitted 6 September, 2016;
originally announced September 2016.
-
On distributionally robust extreme value analysis
Authors:
Jose Blanchet,
Fei He,
Karthyek R. A. Murthy
Abstract:
We study distributional robustness in the context of Extreme Value Theory (EVT). We provide a data-driven method for estimating extreme quantiles in a manner that is robust against incorrect model assumptions underlying the application of the standard Extremal Types Theorem. Typical studies in distributional robustness involve computing worst case estimates over a model uncertainty region expressed in terms of the Kullback-Leibler discrepancy. We go beyond standard distributional robustness in that we investigate different forms of discrepancies, and prove rigorous results which are helpful for understanding the role of a putative model uncertainty region in the context of extreme quantile estimation. Finally, we illustrate our data-driven method in various settings, including examples showing how standard EVT can significantly underestimate quantiles of interest.
Submitted 6 June, 2020; v1 submitted 25 January, 2016;
originally announced January 2016.
-
Exact Simulation of Multidimensional Reflected Brownian Motion
Authors:
Jose Blanchet,
Karthyek R. A. Murthy
Abstract:
We present the first exact simulation method for multidimensional reflected Brownian motion (RBM). Exact simulation in this setting is challenging because of the presence of correlated local-time-like terms in the definition of RBM. We apply recently developed so-called $\varepsilon$-strong simulation techniques (also known as Tolerance-Enforced Simulation) which allow us to provide a piece-wise linear approximation to RBM with $\varepsilon$ (deterministic) error in uniform norm. A novel conditional acceptance/rejection step is then used to eliminate the error. In particular, we condition on a suitably designed information structure so that a feasible proposal distribution can be applied.
Submitted 30 August, 2017; v1 submitted 26 May, 2014;
originally announced May 2014.