-
A Nonparametric Approach with Marginals for Modeling Consumer Choice
Authors:
Yanqiu Ruan,
Xiaobo Li,
Karthyek Murthy,
Karthik Natarajan
Abstract:
Given data on the choices made by consumers for different offer sets, a key challenge is to develop parsimonious models that describe and predict consumer choice behavior while being amenable to prescriptive tasks such as pricing and assortment optimization. The marginal distribution model (MDM) is one such model that requires only the specification of marginal distributions of the random utilities. This paper aims to establish necessary and sufficient conditions for given choice data to be consistent with the MDM hypothesis, inspired by the utility of similar characterizations for the random utility model (RUM). This endeavor leads to an exact characterization of the set of choice probabilities that the MDM can represent. Verifying the consistency of choice data with this characterization is equivalent to solving a polynomial-sized linear program. Since the analogous verification task for RUM is computationally intractable and neither of these models subsumes the other, MDM is helpful in striking a balance between tractability and representational power. The characterization is convenient to use with robust optimization for making data-driven sales and revenue predictions for new unseen assortments. When the choice data lacks consistency with the MDM hypothesis, finding the best-fitting MDM choice probabilities reduces to solving a mixed integer convex program. The results extend naturally to the case where the alternatives can be grouped based on the similarity of the marginal distributions of the utilities. Numerical experiments show that MDM provides better representational power and prediction accuracy than multinomial logit and significantly better computational performance than RUM.
Submitted 24 July, 2023; v1 submitted 12 August, 2022;
originally announced August 2022.
-
Statistical Analysis of Wasserstein Distributionally Robust Estimators
Authors:
Jose Blanchet,
Karthyek Murthy,
Viet Anh Nguyen
Abstract:
We consider statistical methods which invoke a min-max distributionally robust formulation to extract good out-of-sample performance in data-driven optimization and learning problems. Acknowledging the distributional uncertainty in learning from limited samples, the min-max formulations introduce an adversarial inner player to explore unseen covariate data. The resulting Distributionally Robust Optimization (DRO) formulations, which include Wasserstein DRO formulations (our main focus), are specified using optimal transportation phenomena. Upon describing how these infinite-dimensional min-max problems can be approached via a finite-dimensional dual reformulation, the tutorial moves into its main component, namely, explaining a generic recipe for optimally selecting the size of the adversary's budget. This is achieved by studying the limit behavior of an optimal transport projection formulation arising from an inquiry on the smallest confidence region that includes the unknown population risk minimizer. Incidentally, this systematic prescription coincides with those in specific examples in high-dimensional statistics and results in error bounds that are free from the curse of dimensions. Equipped with this prescription, we present a central limit theorem for the DRO estimator and provide a recipe for constructing compatible confidence regions that are useful for uncertainty quantification. The rest of the tutorial is devoted to insights into the nature of the optimizers selected by the min-max formulations and additional applications of optimal transport projections.
Submitted 4 August, 2021;
originally announced August 2021.
-
Efficient Black-Box Importance Sampling for VaR and CVaR Estimation
Authors:
Anand Deo,
Karthyek Murthy
Abstract:
This paper considers Importance Sampling (IS) for the estimation of tail risks of a loss defined in terms of a sophisticated object such as a machine learning feature map or a mixed integer linear optimisation formulation. Assuming only black-box access to the loss and the distribution of the underlying random vector, the paper presents an efficient IS algorithm for estimating the Value at Risk and Conditional Value at Risk. The key challenge in any IS procedure, namely, identifying an appropriate change-of-measure, is automated with a self-structuring IS transformation that learns and replicates the concentration properties of the conditional excess from less rare samples. The resulting estimators enjoy asymptotically optimal variance reduction when viewed in the logarithmic scale. Simulation experiments highlight the efficacy and practicality of the proposed scheme.
Submitted 15 June, 2021;
originally announced June 2021.
-
Testing Group Fairness via Optimal Transport Projections
Authors:
Nian Si,
Karthyek Murthy,
Jose Blanchet,
Viet Anh Nguyen
Abstract:
We present a statistical testing framework to detect if a given machine learning classifier fails to satisfy a wide range of group fairness notions. The proposed test is a flexible, interpretable, and statistically rigorous tool for auditing whether exhibited biases are intrinsic to the algorithm or due to the randomness in the data. The statistical challenges, which may arise from multiple impact criteria that define group fairness and which are discontinuous on model parameters, are conveniently tackled by projecting the empirical measure onto the set of group-fair probability models using optimal transport. This statistic is efficiently computed using linear programming and its asymptotic distribution is explicitly obtained. The proposed framework can also be used for testing composite fairness hypotheses and fairness with multiple sensitive attributes. The optimal transport testing formulation improves interpretability by characterizing the minimal covariate perturbations that eliminate the bias observed in the audit.
Submitted 2 June, 2021;
originally announced June 2021.
-
Achieving Efficiency in Black Box Simulation of Distribution Tails with Self-structuring Importance Samplers
Authors:
Anand Deo,
Karthyek Murthy
Abstract:
This paper presents a novel Importance Sampling (IS) scheme for estimating distribution tails of performance measures modeled with a rich set of tools such as linear programs, integer linear programs, piecewise linear/quadratic objectives, feature maps specified with deep neural networks, etc. The conventional approach of explicitly identifying efficient changes of measure suffers from feasibility and scalability concerns beyond highly stylized models, due to their need to be tailored intricately to the objective and the underlying probability distribution. This bottleneck is overcome in the proposed scheme with an elementary transformation which is capable of implicitly inducing an effective IS distribution in a variety of models by replicating the concentration properties observed in less rare samples. This novel approach is guided by developing a large deviations principle that brings out the phenomenon of self-similarity of optimal IS distributions. The proposed sampler is the first to attain asymptotically optimal variance reduction across a spectrum of multivariate distributions despite being oblivious to the specifics of the underlying model. Its applicability is illustrated with contextual shortest path and portfolio credit risk models informed by neural networks.
Submitted 8 July, 2023; v1 submitted 13 February, 2021;
originally announced February 2021.
-
Confidence Regions in Wasserstein Distributionally Robust Estimation
Authors:
Jose Blanchet,
Karthyek Murthy,
Nian Si
Abstract:
Wasserstein distributionally robust optimization estimators are obtained as solutions of min-max problems in which the statistician selects a parameter minimizing the worst-case loss among all probability models within a certain distance (in a Wasserstein sense) from the underlying empirical measure. While motivated by the need to identify optimal model parameters or decision choices that are robust to model misspecification, these distributionally robust estimators recover a wide range of regularized estimators, including square-root lasso and support vector machines, among others, as particular cases. This paper studies the asymptotic normality of these distributionally robust estimators as well as the properties of an optimal (in a suitable sense) confidence region induced by the Wasserstein distributionally robust optimization formulation. In addition, key properties of min-max distributionally robust optimization problems are also studied. For example, we show that distributionally robust estimators regularize the loss based on its derivative, and we derive general sufficient conditions which show the equivalence between the min-max distributionally robust optimization problem and the corresponding max-min formulation.
Submitted 3 March, 2021; v1 submitted 4 June, 2019;
originally announced June 2019.
-
Generalized Attracting Horseshoe in the Rössler Attractor
Authors:
Karthik Murthy,
Parth Sojitra,
Aminur Rahman,
Ian Jordan,
Denis Blackmore
Abstract:
We show that there is a mildly nonlinear three-dimensional system of ordinary differential equations - realizable by a rather simple electronic circuit - capable of producing a generalized attracting horseshoe map. A system specifically designed to have a Poincaré section yielding the desired map is described, but not pursued due to its complexity, which makes the construction of a circuit realization exceedingly difficult. Instead, the generalized attracting horseshoe and its trapping region are obtained by using a carefully chosen Poincaré map of the Rössler attractor. Novel numerical techniques are employed to iterate the map of the trapping region to approximate the chaotic strange attractor contained in the generalized attracting horseshoe, and an electronic circuit is constructed to produce the map. Several potential applications of the idea of a generalized attracting horseshoe and a physical electronic circuit realization are proposed.
Submitted 15 November, 2018;
originally announced November 2018.
-
Exploiting Partial Correlations in Distributionally Robust Optimization
Authors:
Divya Padmanabhan,
Karthik Natarajan,
Karthyek R. A. Murthy
Abstract:
In this paper, we identify partial correlation information structures that allow for simpler reformulations in evaluating the maximum expected value of mixed integer linear programs with random objective coefficients. To this end, assuming only the knowledge of the mean and the covariance matrix entries restricted to block-diagonal patterns, we develop a reduced semidefinite programming formulation, the complexity of solving which is related to characterizing a suitable projection of the convex hull of the set $\{(\mathbf{x}, \mathbf{x}\mathbf{x}'): \mathbf{x} \in \mathcal{X}\}$ where $\mathcal{X}$ is the feasible region. In some cases, this lends itself to efficient representations that result in polynomial-time solvable instances, most notably for the distributionally robust appointment scheduling problem with random job durations as well as for computing tight bounds in Project Evaluation and Review Technique (PERT) networks and linear assignment problems. To the best of our knowledge, this is the first example of a distributionally robust optimization formulation for appointment scheduling that permits a tight polynomial-time solvable semidefinite programming reformulation which explicitly captures partially known correlation information between uncertain processing times of the jobs to be scheduled.
Submitted 23 October, 2018;
originally announced October 2018.
-
Optimal Transport Based Distributionally Robust Optimization: Structural Properties and Iterative Schemes
Authors:
Jose Blanchet,
Karthyek Murthy,
Fan Zhang
Abstract:
We consider optimal transport based distributionally robust optimization (DRO) problems with locally strongly convex transport cost functions and affine decision rules. Under conventional convexity assumptions on the underlying loss function, we obtain structural results about the value function, the optimal policy, and the worst-case optimal transport adversarial model. These results expose a rich structure embedded in the DRO problem (e.g., strong convexity even if the non-DRO problem is not strongly convex, a suitable scaling of the Lagrangian for the DRO constraint, etc.), which is crucial for the design of efficient algorithms. As a consequence of these results, one can develop efficient optimization procedures which have the same sample and iteration complexity as a natural non-DRO benchmark algorithm such as stochastic gradient descent.
Submitted 25 April, 2021; v1 submitted 4 October, 2018;
originally announced October 2018.
-
Robust Wasserstein Profile Inference and Applications to Machine Learning
Authors:
Jose Blanchet,
Yang Kang,
Karthyek Murthy
Abstract:
We show that several machine learning estimators, including square-root LASSO (Least Absolute Shrinkage and Selection Operator) and regularized logistic regression, can be represented as solutions to distributionally robust optimization (DRO) problems. The associated uncertainty regions are based on suitably defined Wasserstein distances. Hence, our representations allow us to view regularization as a result of introducing an artificial adversary that perturbs the empirical distribution to account for out-of-sample effects in loss estimation. In addition, we introduce RWPI (Robust Wasserstein Profile Inference), a novel inference methodology which extends the use of methods inspired by Empirical Likelihood to the setting of optimal transport costs (of which Wasserstein distances are a particular case). We use RWPI to show how to optimally select the size of uncertainty regions, and as a consequence, we are able to choose regularization parameters for these machine learning estimators without the use of cross validation. Numerical experiments are also given to validate our theoretical findings.
Submitted 21 October, 2020; v1 submitted 18 October, 2016;
originally announced October 2016.
-
Exact and efficient simulation of tail probabilities of heavy-tailed infinite series
Authors:
Henrik Hult,
Sandeep Juneja,
Karthyek Murthy
Abstract:
We develop an efficient simulation algorithm for computing the tail probabilities of the infinite series $S = \sum_{n \geq 1} a_n X_n$ when random variables $X_n$ are heavy-tailed. As $S$ is the sum of infinitely many random variables, any simulation algorithm that stops after simulating only fixed, finitely many random variables is likely to introduce a bias. We overcome this challenge by rewriting the tail probability of interest as a sum of a random number of telescoping terms, and subsequently developing conditional Monte Carlo based low variance simulation estimators for each telescoping term. The resulting algorithm is proved to result in estimators that a) have no bias, and b) require only a fixed, finite number of replications irrespective of how rare the tail probability of interest is. Thus, by combining a traditional variance reduction technique such as conditional Monte Carlo with more recent use of auxiliary randomization to remove bias in a multi-level type representation, we develop an efficient and unbiased simulation algorithm for tail probabilities of $S$. These have many applications including in analysis of financial time-series and stochastic recurrence equations arising in models in actuarial risk and population biology.
Submitted 6 September, 2016;
originally announced September 2016.
-
Quantifying Distributional Model Risk via Optimal Transport
Authors:
Jose Blanchet,
Karthyek R. A. Murthy
Abstract:
This paper deals with the problem of quantifying the impact of model misspecification when computing general expected values of interest. The methodology that we propose is applicable in great generality; in particular, we provide examples involving path dependent expectations of stochastic processes. Our approach consists in computing bounds for the expectation of interest regardless of the probability measure used, as long as the measure lies within a prescribed tolerance measured in terms of a flexible class of distances from a suitable baseline model. These distances, based on optimal transportation between probability measures, include Wasserstein's distances as particular cases. The proposed methodology is well-suited for risk analysis, as we demonstrate with a number of applications. We also discuss how to estimate the tolerance region non-parametrically using Skorokhod-type embeddings in some of these applications.
Submitted 1 July, 2017; v1 submitted 5 April, 2016;
originally announced April 2016.
-
On distributionally robust extreme value analysis
Authors:
Jose Blanchet,
Fei He,
Karthyek R. A. Murthy
Abstract:
We study distributional robustness in the context of Extreme Value Theory (EVT). We provide a data-driven method for estimating extreme quantiles in a manner that is robust against incorrect model assumptions underlying the application of the standard Extremal Types Theorem. Typical studies in distributional robustness involve computing worst case estimates over a model uncertainty region expressed in terms of the Kullback-Leibler discrepancy. We go beyond standard distributional robustness in that we investigate different forms of discrepancies, and prove rigorous results which are helpful for understanding the role of a putative model uncertainty region in the context of extreme quantile estimation. Finally, we illustrate our data-driven method in various settings, including examples showing how standard EVT can significantly underestimate quantiles of interest.
Submitted 6 June, 2020; v1 submitted 25 January, 2016;
originally announced January 2016.
-
Tail Asymptotics for Delay in a Half-loaded GI/GI/2 Queue with Heavy-tailed Job Sizes
Authors:
Jose Blanchet,
Karthyek Murthy
Abstract:
We obtain asymptotic bounds for the tail distribution of steady-state waiting time in a two server queue where each server processes incoming jobs at a rate equal to the rate of their arrivals (that is, the half-loaded regime). The job sizes are taken to be regularly varying. When the incoming jobs have finite variance, there are basically two types of effects that dominate the tail asymptotics. While the quantitative distinction between these two manifests itself only in the slowly varying components, the two effects arise from qualitatively very different phenomena (the arrival of one extremely big job, or of two big jobs). Then there is a phase transition that occurs when the incoming jobs have infinite variance. In that case, only one of these effects dominates the tail asymptotics, the one involving the arrival of one extremely big job.
Submitted 16 February, 2015;
originally announced February 2015.
-
Exact Simulation of Multidimensional Reflected Brownian Motion
Authors:
Jose Blanchet,
Karthyek R. A. Murthy
Abstract:
We present the first exact simulation method for multidimensional reflected Brownian motion (RBM). Exact simulation in this setting is challenging because of the presence of correlated local-time-like terms in the definition of RBM. We apply recently developed so-called $\varepsilon$-strong simulation techniques (also known as Tolerance-Enforced Simulation) which allow us to provide a piece-wise linear approximation to RBM with $\varepsilon$ (deterministic) error in uniform norm. A novel conditional acceptance/rejection step is then used to eliminate the error. In particular, we condition on a suitably designed information structure so that a feasible proposal distribution can be applied.
Submitted 30 August, 2017; v1 submitted 26 May, 2014;
originally announced May 2014.
-
State-independent Importance Sampling for Random Walks with Regularly Varying Increments
Authors:
Karthyek R. A. Murthy,
Sandeep Juneja,
Jose Blanchet
Abstract:
We develop importance sampling based efficient simulation techniques for three commonly encountered rare event probabilities associated with random walks having i.i.d. regularly varying increments; namely, 1) the large deviation probabilities, 2) the level crossing probabilities, and 3) the level crossing probabilities within a regenerative cycle. Exponential twisting based state-independent methods, which are effective in efficiently estimating these probabilities for light-tailed increments, are not applicable when the increments are heavy-tailed. To address the latter case, more complex and elegant state-dependent efficient simulation algorithms have been developed in the literature over the last few years. We propose that by suitably decomposing these rare event probabilities into a dominant and further residual components, simpler state-independent importance sampling algorithms can be devised for each component, resulting in composite unbiased estimators with desirable efficiency properties. When the increments have infinite variance, there is an added complexity in estimating the level crossing probabilities, as even the well known zero-variance measures have an infinite expected termination time. We adapt our algorithms so that this expectation is finite while the estimators remain strongly efficient. Numerically, the proposed estimators perform at least as well as, and sometimes substantially better than, the existing state-dependent estimators in the literature.
Submitted 27 September, 2014; v1 submitted 15 June, 2012;
originally announced June 2012.