
Layered Graph Security Games

Authors: Jakub Černý, Chun Kai Ling, Christian Kroer, Garud Iyengar

Abstract: Security games model strategic interactions in adversarial real-world applications. Such applications often involve extremely large but highly structured strategy sets (e.g., selecting a distribution over all patrol routes in a given graph). In this paper, we represent each player's strategy space using a layered graph whose paths represent an exponentially large strategy space. Our formulation entails not only classic pursuit-evasion games, but also other security games, such as those modeling anti-terrorism and logistical interdiction. We study two-player zero-sum games under two distinct utility models: linear and binary utilities. We show that under linear utilities, Nash equilibrium can be computed in polynomial time, while binary utilities may lead to situations where even computing a best-response is computationally intractable. To this end, we propose a practical algorithm based on incremental strategy generation and mixed integer linear programs. We show through extensive experiments that our algorithm efficiently computes $ε$-equilibrium for many games of interest. We find that target values and graph structure often have a larger influence on running times as compared to the size of the graph per se.

Submitted 9 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

Comments: In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence. IJCAI Press, 2024
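The abstract above notes that the paths of a layered graph can encode an exponentially large strategy space. The following is a minimal sketch of that counting argument only, not the authors' algorithm: it counts first-layer-to-last-layer paths by dynamic programming. The function names and the fully connected test graph are hypothetical, chosen for illustration.

```python
from typing import Dict, Hashable, List, Tuple


def count_layered_paths(layers: List[List[Hashable]],
                        edges: Dict[Hashable, List[Hashable]]) -> int:
    """Count paths from any first-layer vertex to any last-layer vertex.

    Assumes (for this sketch) that vertex names are unique across layers
    and that every edge goes from one layer to the next.
    """
    # paths_to[v] = number of paths that start in the first layer and end at v
    paths_to = {v: 1 for v in layers[0]}
    for layer in layers[:-1]:
        for u in layer:
            for v in edges.get(u, []):
                paths_to[v] = paths_to.get(v, 0) + paths_to.get(u, 0)
    return sum(paths_to.get(v, 0) for v in layers[-1])


def complete_layered_graph(width: int, depth: int) -> Tuple[list, dict]:
    """Hypothetical test graph: every vertex connects to all vertices in the next layer."""
    layers = [[(d, i) for i in range(width)] for d in range(depth)]
    edges = {(d, i): [(d + 1, j) for j in range(width)]
             for d in range(depth - 1) for i in range(width)}
    return layers, edges


layers, edges = complete_layered_graph(width=3, depth=5)
total_paths = count_layered_paths(layers, edges)  # width ** depth paths
```

With only `width * depth` vertices the fully connected graph has `width ** depth` paths, which is why a distribution over paths cannot in general be enumerated explicitly.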

arXiv:2401.06710

Model-Free Approximate Bayesian Learning for Large-Scale Conversion Funnel Optimization

Authors: Garud Iyengar, Raghav Singal

Abstract: The flexibility of choosing the ad action as a function of the consumer state is critical for modern-day marketing campaigns. We study the problem of identifying the optimal sequential personalized interventions that maximize the adoption probability for a new product. We model consumer behavior by a conversion funnel that captures the state of each consumer (e.g., interaction history with the firm) and allows the consumer behavior to vary as a function of both her state and firm's sequential interventions. We show our model captures consumer behavior with very high accuracy (out-of-sample AUC of over 0.95) in a real-world email marketing dataset. However, it results in a very large-scale learning problem, where the firm must learn the state-specific effects of various interventions from consumer interactions. We propose a novel attribution-based decision-making algorithm for this problem that we call model-free approximate Bayesian learning. Our algorithm inherits the interpretability and scalability of Thompson sampling for bandits and maintains an approximate belief over the value of each state-specific intervention. The belief is updated as the algorithm interacts with the consumers. Despite being an approximation to the Bayes update, we prove the asymptotic optimality of our algorithm and analyze its convergence rate. We show that our algorithm significantly outperforms traditional approaches on extensive simulations calibrated to a real-world email marketing dataset.

Submitted 12 January, 2024; originally announced January 2024.
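For readers unfamiliar with Thompson sampling, which the abstract above cites as the inspiration for its algorithm, here is a minimal sketch of the textbook Beta-Bernoulli version for a plain multi-armed bandit. It is not the paper's model-free approximate Bayesian learning method; the conversion rates, function name, and seed are assumptions made for illustration.

```python
import random
from typing import List, Sequence


def thompson_sampling(true_rates: Sequence[float], rounds: int, seed: int = 0) -> List[int]:
    """Run Beta-Bernoulli Thompson sampling and return the pull count of each arm."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(rounds):
        # Draw one sample from each arm's Beta posterior (uniform Beta(1, 1) prior).
        samples = [rng.betavariate(successes[a] + 1, failures[a] + 1)
                   for a in range(n_arms)]
        # Play the arm whose sampled conversion rate is largest.
        arm = max(range(n_arms), key=samples.__getitem__)
        reward = 1 if rng.random() < true_rates[arm] else 0
        # Bayes update of the chosen arm's posterior.
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


# Hypothetical conversion rates for three interventions.
pulls = thompson_sampling(true_rates=[0.1, 0.5, 0.9], rounds=2000)
```

The belief for each arm is just a pair of counts, which is the kind of interpretability and scalability the abstract refers to.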

arXiv:2312.01018

Decentralized Finance: Protocols, Risks, and Governance

Authors: Agostino Capponi, Garud Iyengar, Jay Sethuraman

Abstract: Financial markets are undergoing an unprecedented transformation. Technological advances have brought major improvements to the operations of financial services. While these advances promote improved accessibility and convenience, traditional finance shortcomings like lack of transparency and moral hazard frictions continue to plague centralized platforms, imposing societal costs. In this paper, we argue how these shortcomings and frictions are being mitigated by the decentralized finance (DeFi) ecosystem. We delve into the workings of smart contracts, the backbone of DeFi transactions, with an emphasis on those underpinning token exchange and lending services. We highlight the pros and cons of the novel form of decentralized governance introduced via the ownership of governance tokens. Despite its potential, the current DeFi infrastructure introduces operational risks to users, which we segment into five primary categories: consensus mechanisms, protocol, oracle, frontrunning, and systemic risks. We conclude by emphasizing the need for future research to focus on the scalability of existing blockchains, the improved design and interoperability of DeFi protocols, and the rigorous auditing of smart contracts.

Submitted 1 December, 2023; originally announced December 2023.

Comments: 2

arXiv:2310.15286

A Doubly Robust Approach to Sparse Reinforcement Learning

Authors: Wonyoung Kim, Garud Iyengar, Assaf Zeevi

Abstract: We propose a new regret minimization algorithm for episodic sparse linear Markov decision process (SMDP) where the state-transition distribution is a linear function of observed features. The only previously known algorithm for SMDP requires the knowledge of the sparsity parameter and oracle access to an unknown policy. We overcome these limitations by combining the doubly robust method that allows one to use feature vectors of \emph{all} actions with a novel analysis technique that enables the algorithm to use data from all periods in all episodes. The regret of the proposed algorithm is $\tilde{O}(σ^{-1}_{\min} s_{\star} H \sqrt{N})$, where $σ_{\min}$ denotes the restrictive minimum eigenvalue of the average Gram matrix of feature vectors, $s_\star$ is the sparsity parameter, $H$ is the length of an episode, and $N$ is the number of rounds. We provide a lower regret bound that matches the upper bound up to logarithmic factors on a newly identified subclass of SMDPs. Our numerical experiments support our theoretical results and demonstrate the superior performance of our algorithm.

Submitted 23 October, 2023; originally announced October 2023.

arXiv:2308.02709

Scalable Computation of Causal Bounds

Authors: Madhumitha Shridharan, Garud Iyengar

Abstract: We consider the problem of computing bounds for causal queries on causal graphs with unobserved confounders and discrete valued observed variables, where identifiability does not hold. Existing non-parametric approaches for computing such bounds use linear programming (LP) formulations that quickly become intractable for existing solvers because the size of the LP grows exponentially in the number of edges in the causal graph. We show that this LP can be significantly pruned, allowing us to compute bounds for significantly larger causal inference problems compared to existing techniques. This pruning procedure allows us to compute bounds in closed form for a special class of problems, including a well-studied family of problems where multiple confounded treatments influence an outcome. We extend our pruning methodology to fractional LPs which compute bounds for causal queries which incorporate additional observations about the unit. We show that our methods provide significant runtime improvement compared to benchmarks in experiments and extend our results to the finite data setting. For causal inference without additional observations, we propose an efficient greedy heuristic that produces high quality bounds, and scales to problems that are several orders of magnitude larger than those for which the pruned LP can be solved.

Submitted 4 August, 2023; originally announced August 2023.

arXiv:2306.10081

Optimizer's Information Criterion: Dissecting and Correcting Bias in Data-Driven Optimization

Authors: Garud Iyengar, Henry Lam, Tianyu Wang

Abstract: In data-driven optimization, the sample performance of the obtained decision typically incurs an optimistic bias against the true performance, a phenomenon commonly known as the Optimizer's Curse and intimately related to overfitting in machine learning. Common techniques to correct this bias, such as cross-validation, require repeatedly solving additional optimization problems and are therefore computationally expensive. We develop a general bias correction approach, building on what we call Optimizer's Information Criterion (OIC), that directly approximates the first-order bias and does not require solving any additional optimization problems. Our OIC generalizes the celebrated Akaike Information Criterion to evaluate the objective performance in data-driven optimization, which crucially involves not only model fitting but also its interplay with the downstream optimization. As such it can be used for decision selection instead of only model selection. We apply our approach to a range of data-driven optimization formulations comprising empirical and parametric models, their regularized counterparts, and furthermore contextual optimization. Finally, we provide numerical validation on the superior performance of our approach under synthetic and real-world datasets.

Submitted 16 October, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

arXiv:2306.00096

Learning the Pareto Front Using Bootstrapped Observation Samples

Authors: Wonyoung Kim, Garud Iyengar, Assaf Zeevi

Abstract: We consider Pareto front identification (PFI) for linear bandits (PFILin), i.e., the goal is to identify a set of arms with undominated mean reward vectors when the mean reward vector is a linear function of the context. PFILin includes the best arm identification problem and multi-objective active learning as special cases. The sample complexity of our proposed algorithm is optimal up to a logarithmic factor. In addition, the regret incurred by our algorithm during the estimation is within a logarithmic factor of the optimal regret among all algorithms that identify the Pareto front. Our key contribution is a new estimator that in every round updates the estimate for the unknown parameter along multiple context directions -- in contrast to the conventional estimator that only updates the parameter estimate along the chosen context. This allows us to use low-regret arms to collect information about Pareto optimal arms. Our key innovation is to reuse the exploration samples multiple times; in contrast to conventional estimators that use each sample only once. Numerical experiments demonstrate that the proposed algorithm successfully identifies the Pareto front while controlling the regret.

Submitted 22 May, 2024; v1 submitted 31 May, 2023; originally announced June 2023.

Comments: 37 pages including appendix

arXiv:2301.13791

Improved Algorithms for Multi-period Multi-class Packing Problems with Bandit Feedback

Authors: Wonyoung Kim, Garud Iyengar, Assaf Zeevi

Abstract: We consider the linear contextual multi-class multi-period packing problem (LMMP) where the goal is to pack items such that the total vector of consumption is below a given budget vector and the total value is as large as possible. We consider the setting where the reward and the consumption vector associated with each action is a class-dependent linear function of the context, and the decision-maker receives bandit feedback. LMMP includes linear contextual bandits with knapsacks and online revenue management as special cases. We establish a new estimator which guarantees a faster convergence rate, and consequently, a lower regret in such problems. We propose a bandit policy that is a closed-form function of said estimated parameters. When the contexts are non-degenerate, the regret of the proposed policy is sublinear in the context dimension, the number of classes, and the time horizon $T$ when the budget grows at least as $\sqrt{T}$. We also resolve an open problem posed by Agrawal & Devanur (2016) and extend the result to a multi-class setting. Our numerical experiments clearly demonstrate that the performance of our policy is superior to other benchmarks in the literature.

Submitted 31 May, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

Comments: Accepted in ICML 2023, 44 pages including Appendix

arXiv:2212.01518

Hedging Complexity in Generalization via a Parametric Distributionally Robust Optimization Framework

Authors: Garud Iyengar, Henry Lam, Tianyu Wang

Abstract: Empirical risk minimization (ERM) and distributionally robust optimization (DRO) are popular approaches for solving stochastic optimization problems that appear in operations management and machine learning. Existing generalization error bounds for these methods depend on either the complexity of the cost function or dimension of the random perturbations. Consequently, the performance of these methods can be poor for high-dimensional problems with complex objective functions. We propose a simple approach in which the distribution of random perturbations is approximated using a parametric family of distributions. This mitigates both sources of complexity; however, it introduces a model misspecification error. We show that this new source of error can be controlled by suitable DRO formulations. Our proposed parametric DRO approach has significantly improved generalization bounds over existing ERM and DRO methods and parametric ERM for a wide variety of settings. Our method is particularly effective under distribution shifts and works broadly in contextual optimization. We also illustrate the superior performance of our approach on both synthetic and real-data portfolio optimization and regression tasks.

Submitted 24 September, 2023; v1 submitted 2 December, 2022; originally announced December 2022.

Comments: Preliminary version appeared in AISTATS 2023

arXiv:2206.05134

Distributionally Robust End-to-End Portfolio Construction

Authors: Giorgio Costa, Garud N. Iyengar

Abstract: We propose an end-to-end distributionally robust system for portfolio construction that integrates the asset return prediction model with a distributionally robust portfolio optimization model. We also show how to learn the risk-tolerance parameter and the degree of robustness directly from data. End-to-end systems have an advantage in that information can be communicated between the prediction and decision layers during training, allowing the parameters to be trained for the final task rather than solely for predictive performance. However, existing end-to-end systems are not able to quantify and correct for the impact of model risk on the decision layer. Our proposed distributionally robust end-to-end portfolio selection system explicitly accounts for the impact of model risk. The decision layer chooses portfolios by solving a minimax problem where the distribution of the asset returns is assumed to belong to an ambiguity set centered around a nominal distribution. Using convex duality, we recast the minimax problem in a form that allows for efficient training of the end-to-end system.

Submitted 10 June, 2022; originally announced June 2022.

arXiv:2103.13929

Multinomial Logit Contextual Bandits: Provable Optimality and Practicality

Authors: Min-hwan Oh, Garud Iyengar

Abstract: We consider a sequential assortment selection problem where the user choice is given by a multinomial logit (MNL) choice model whose parameters are unknown. In each period, the learning agent observes a $d$-dimensional contextual information about the user and the $N$ available items, and offers an assortment of size $K$ to the user, and observes the bandit feedback of the item chosen from the assortment. We propose upper confidence bound based algorithms for this MNL contextual bandit. The first algorithm is a simple and practical method which achieves an $\tilde{\mathcal{O}}(d\sqrt{T})$ regret over $T$ rounds. Next, we propose a second algorithm which achieves a $\tilde{\mathcal{O}}(\sqrt{dT})$ regret. This matches the lower bound for the MNL bandit problem, up to logarithmic terms, and improves on the best known result by a $\sqrt{d}$ factor. To establish this sharper regret bound, we present a non-asymptotic confidence bound for the maximum likelihood estimator of the MNL model that may be of independent interest as its own theoretical contribution. We then revisit the simpler, significantly more practical, first algorithm and show that a simple variant of the algorithm achieves the optimal regret for a broad class of important applications.

Submitted 25 March, 2021; originally announced March 2021.

Comments: Accepted in AAAI 2021 (Main Technical Track)
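As background for the MNL choice model named in the abstract above, here is a minimal sketch of how choice probabilities for an offered assortment are commonly computed when each item's utility is linear in its context vector. The no-purchase option with utility 0 is a standard convention assumed here, not something stated in the abstract, and the function name and example inputs are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def mnl_choice_probabilities(features: Sequence[Sequence[float]],
                             theta: Sequence[float]) -> Tuple[List[float], float]:
    """Return (probability of each offered item, probability of choosing nothing).

    Utility of item i is the inner product of its feature vector with theta;
    the no-purchase option is assumed to have utility 0.
    """
    utilities = [sum(x_k * t_k for x_k, t_k in zip(x, theta)) for x in features]
    # Subtract the largest utility before exponentiating to avoid overflow.
    shift = max([0.0] + utilities)
    item_weights = [math.exp(u - shift) for u in utilities]
    outside_weight = math.exp(-shift)
    total = outside_weight + sum(item_weights)
    return [w / total for w in item_weights], outside_weight / total


# Hypothetical assortment of K = 3 items with d = 2 features and theta = 0.
probs, no_choice = mnl_choice_probabilities(
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0])
```

A bandit algorithm of the kind the abstract describes would replace `theta` with an estimate (or an optimistic estimate) learned from past choices; that part is not sketched here.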

arXiv:2010.03983

Online Allocation of Reusable Resources via Algorithms Guided by Fluid Approximations

Authors: Vineet Goyal, Garud Iyengar, Rajan Udwani

Abstract: We consider the problem of online allocation (matching and assortments) of reusable resources where customers arrive sequentially in an adversarial fashion and allocated resources are used or rented for a stochastic duration that is drawn independently from known distributions. Focusing on the case of large inventory, we give an algorithm that is $(1-1/e)$ competitive for general usage distributions. At the heart of our result is the notion of a relaxed online algorithm that is only subjected to fluid approximations of the stochastic elements in the problem. The output of this algorithm serves as a guide for the final algorithm. This leads to a principled approach for seamlessly addressing stochastic elements (such as reusability, customer choice, and combinations thereof) in online resource allocation problems, that may be useful more broadly.

Submitted 8 October, 2020; originally announced October 2020.

Comments: arXiv admin note: text overlap with arXiv:2002.02430

arXiv:2007.08477

Sparsity-Agnostic Lasso Bandit

Authors: Min-hwan Oh, Garud Iyengar, Assaf Zeevi

Abstract: We consider a stochastic contextual bandit problem where the dimension $d$ of the feature vectors is potentially large, however, only a sparse subset of features of cardinality $s_0 \ll d$ affect the reward function. Essentially all existing algorithms for sparse bandits require a priori knowledge of the value of the sparsity index $s_0$. This knowledge is almost never available in practice, and misspecification of this parameter can lead to severe deterioration in the performance of existing methods. The main contribution of this paper is to propose an algorithm that does not require prior knowledge of the sparsity index $s_0$ and establish tight regret bounds on its performance under mild conditions. We also comprehensively evaluate our proposed algorithm numerically and show that it consistently outperforms existing methods, even when the correct sparsity index is revealed to them but is kept hidden from our algorithm.

Submitted 28 April, 2021; v1 submitted 16 July, 2020; originally announced July 2020.

arXiv:2004.10398

Sequential Anomaly Detection using Inverse Reinforcement Learning

Authors: Min-hwan Oh, Garud Iyengar

Abstract: One of the most interesting application scenarios in anomaly detection is when sequential data are targeted. For example, in a safety-critical environment, it is crucial to have an automatic detection system to screen the streaming data gathered by monitoring sensors and to report abnormal observations if detected in real-time. Oftentimes, stakes are much higher when these potential anomalies are intentional or goal-oriented. We propose an end-to-end framework for sequential anomaly detection using inverse reinforcement learning (IRL), whose objective is to determine the decision-making agent's underlying function which triggers his/her behavior. The proposed method takes the sequence of actions of a target agent (and possibly other meta information) as input. The agent's normal behavior is then understood by the reward function which is inferred via IRL. We use a neural network to represent a reward function. Using a learned reward function, we evaluate whether a new observation from the target agent follows a normal pattern. In order to construct a reliable anomaly detection method and take into consideration the confidence of the predicted anomaly score, we adopt a Bayesian approach for IRL. The empirical study on publicly available real-world data shows that our proposed method is effective in identifying anomalies.

Submitted 22 April, 2020; originally announced April 2020.

Comments: Published in KDD 2019 (Oral in Research Paper Track)

arXiv:2002.02430

Asymptotically Optimal Competitive Ratio for Online Allocation of Reusable Resources

Authors: Vineet Goyal, Garud Iyengar, Rajan Udwani

Abstract: We consider the problem of online allocation (matching, budgeted allocations, and assortments) of reusable resources where an adversarial sequence of resource requests is revealed over time and allocated resources are used/rented for a stochastic duration, drawn independently from known resource usage distributions. This problem is a fundamental generalization of well studied models in online matching and resource allocation. We give an algorithm that obtains the best possible competitive ratio of $(1-1/e)$ for general usage distributions and large resource capacities. At the heart of our algorithm is a new quantity that factors in the potential of reusability for each resource by (computationally) creating an asymmetry between identical units of the resource. In order to control the stochastic dependencies induced by reusability, we introduce a relaxed online algorithm that is only subject to fluid approximations of the stochastic elements in the problem. The output of this relaxed algorithm guides the overall algorithm. Finally, we establish competitive ratio guarantees by constructing a feasible solution to an LP-free system of constraints. More generally, these ideas lead to a principled approach for integrating stochastic and combinatorial elements (such as reusability, customer choice, and budgeted allocations) in online resource allocation problems.

Submitted 18 July, 2022; v1 submitted 6 February, 2020; originally announced February 2020.

Comments: 1-pg abstract in WINE 2021. This version combines results from previous iteration (arXiv:2002.02430v3) and the short note arXiv:2010.03983

arXiv:1808.10552

Directed Exploration in PAC Model-Free Reinforcement Learning

Authors: Min-hwan Oh, Garud Iyengar

Abstract: We study an exploration method for model-free RL that generalizes the counter-based exploration bonus methods and takes into account long term exploratory value of actions rather than a single step look-ahead. We propose a model-free RL method that modifies Delayed Q-learning and utilizes the long-term exploration bonus with provable efficiency. We show that our proposed method finds a near-optimal policy in polynomial time (PAC-MDP), and also provide experimental evidence that our proposed algorithm is an efficient exploration method.

Submitted 30 August, 2018; originally announced August 2018.

arXiv:1808.05988

Attainment Ratings for Graph-Query Recommendation

Authors: Hal Cooper, Garud Iyengar, Ching-Yung Lin

Abstract: The video game industry is larger than both the film and music industries combined. Recommender systems for video games have received relatively scant academic attention, despite the uniqueness of the medium and its data. In this paper, we introduce a graph-based recommender system that makes use of interactivity, arguably the most significant feature of video gaming. We show that the use of implicit data that tracks user-game interactions and levels of attainment (e.g. Sony Playstation Trophies, Microsoft Xbox Achievements) has high predictive value when making recommendations. Furthermore, we argue that the characteristics of the video gaming hobby (low cost, high duration, socially relevant) make clear the necessity of personalized, individual recommendations that can incorporate social networking information. We demonstrate the natural suitability of graph-query based recommendation for this purpose.

Submitted 17 August, 2018; originally announced August 2018.

arXiv:1808.02433

Robust Implicit Backpropagation

Authors: Francois Fagan, Garud Iyengar

Abstract: Arguably the biggest challenge in applying neural networks is tuning the hyperparameters, in particular the learning rate. The sensitivity to the learning rate is due to the reliance on backpropagation to train the network. In this paper we present the first application of Implicit Stochastic Gradient Descent (ISGD) to train neural networks, a method known in convex optimization to be unconditionally stable and robust to the learning rate. Our key contribution is a novel layer-wise approximation of ISGD which makes its updates tractable for neural networks. Experiments show that our method is more robust to high learning rates and generally outperforms standard backpropagation on a variety of tasks.

Submitted 7 August, 2018; originally announced August 2018.

arXiv:1806.01384 [pdf, other]

Passive Static Equilibrium with Frictional Contacts and Application to Grasp Stability Analysis

Authors: Maximilian Haas-Heger, Christos Papadimitriou, Mihalis Yannakakis, Garud Iyengar, Matei Ciocarlie

Abstract: This paper studies the problem of passive grasp stability under an external disturbance, that is, the ability of a grasp to resist a disturbance through passive responses at the contacts. To obtain physically consistent results, such a model must account for friction phenomena at each contact; the difficulty is that friction forces depend in non-linear fashion on contact behavior (stick or slip). We develop the first polynomial-time algorithm which either solves such complex equilibrium constraints for two-dimensional grasps, or otherwise concludes that no solution exists. To achieve this, we show that the number of possible 'slip states' (where each contact is labeled as either sticking or slipping) that must be considered is polynomial (in fact quadratic) in the number of contacts, and not exponential as previously thought. Our algorithm captures passive response behaviors at each contact, while accounting for constraints on friction forces such as the maximum dissipation principle.

Submitted 13 June, 2018; v1 submitted 4 June, 2018; originally announced June 2018.

Comments: Robotics: Science and Systems June 26-30, 2018 (9 pages, 7 figures)

arXiv:1803.08577 [pdf, other]

Unbiased scalable softmax optimization

Authors: Francois Fagan, Garud Iyengar

Abstract: Recent neural network and language models rely on softmax distributions with an extremely large number of categories. Since calculating the softmax normalizing constant in this context is prohibitively expensive, there is a growing literature of efficiently computable but biased estimates of the softmax. In this paper we propose the first unbiased algorithms for maximizing the softmax likelihood whose work per iteration is independent of the number of classes and datapoints (and no extra work is required at the end of each epoch). We show that our proposed unbiased methods comprehensively outperform the state-of-the-art on seven real world datasets.

Submitted 22 March, 2018; originally announced March 2018.

arXiv:1801.06558 [pdf, other]

Passive Reaction Analysis for Grasp Stability

Authors: Maximilian Haas-Heger, Garud Iyengar, Matei Ciocarlie

Abstract: In this paper we focus on the following problem in multi-fingered robotic grasping: assuming that an external wrench is being applied to a grasped object, will the contact forces between the hand and the object, as well as the hand joints, respond in such a way as to preserve quasi-static equilibrium? In particular, we assume that there is no change in the joint torques being actively exerted by the motors; any change in contact forces and joint torques is due exclusively to passive effects arising in response to the external disturbance. Such passive effects include, for example, joints that are driven by highly geared motors (a common occurrence in practice) and thus do not back drive in response to external torques. To account for non-linear phenomena encountered in such cases, and which existing methods do not consider, we formulate the problem as a mixed integer program used in the inner loop of an iterative solver. We present evidence showing that this formulation captures important effects for assessing the stability of a grasp employing some of the most commonly used actuation mechanisms.

Submitted 19 January, 2018; originally announced January 2018.

Comments: In press for IEEE Transactions on Automation Science and Engineering Special Issue. 12 pages, 9 figures, 1 table

arXiv:1611.01558 [pdf, other]

Social influence makes self-interested crowds smarter: an optimal control perspective

Authors: Yu Luo, Garud Iyengar, Venkat Venkatasubramanian

Abstract: It is very common to observe crowds of individuals solving similar problems with similar information in a largely independent manner. We argue here that crowds can become "smarter," i.e., more efficient and robust, by partially following the average opinion. This observation runs counter to the widely accepted claim that the wisdom of crowds deteriorates with social influence. The key difference is that individuals are self-interested and hence will reject feedback that does not improve their performance. We propose a control-theoretic methodology to compute the degree of social influence, i.e., the level to which one accepts the population feedback, that optimizes performance. We conducted an experiment with human subjects ($N = 194$), where the participants were first asked to solve an optimization problem independently, i.e., under no social influence. Our theoretical methodology estimates a $30\%$ degree of social influence to be optimal, resulting in a $29\%$ improvement in the crowd's performance. We then let the same cohort solve a new problem and have access to the average opinion. Surprisingly, we find the average degree of social influence in the cohort to be $32\%$ with a $29\%$ improvement in performance; in other words, the crowd self-organized into a near-optimal setting. We believe this new paradigm for making crowds "smarter" has the potential for making a significant impact on a diverse set of fields, from population health to government planning. We include a case study to show how a crowd of states can collectively learn the level of taxation and expenditure that optimizes economic growth.

Submitted 4 November, 2016; originally announced November 2016.

Comments: Venkat Venkatasubramanian is the corresponding author. Email: [email protected]

MSC Class: 93C95 (Primary); 93B52; 93C05; 93C55; 15A18; 65N22; 65F10; 65F15; 65F35 (Secondary)

arXiv:1110.0685 [pdf, ps, other]

doi 10.1016/j.ejor.2018.02.059

Energy Aware Scheduling for Weighted Completion Time and Weighted Tardiness

Authors: Rodrigo A. Carrasco, Garud Iyengar, Cliff Stein

Abstract: The ever increasing adoption of mobile devices with limited energy storage capacity, on the one hand, and more awareness of the environmental impact of massive data centres and server pools, on the other hand, have both led to an increased interest in energy management algorithms. The main contribution of this paper is to present several new constant factor approximation algorithms for energy aware scheduling problems where the objective is to minimize weighted completion time plus the cost of the energy consumed, in the one machine non-preemptive setting, while allowing release dates and deadlines. Unlike previously known algorithms, these new algorithms can handle general job-dependent energy cost functions, extending the application of these algorithms to settings outside the typical CPU-energy one. These new settings include problems where, in addition to or instead of energy costs, we also have maintenance costs, wear and tear, replacement costs, etc., which in general depend on the speed at which the machine runs but also depend on the types of jobs processed. Our algorithms also extend to approximating weighted tardiness plus energy cost, an inherently more difficult problem that has not been addressed in the literature.

Submitted 4 October, 2011; originally announced October 2011.

Comments: 17 pages

arXiv:cs/0612065 [pdf, ps, other]

An equilibrium model for matching impatient demand and patient supply over time

Authors: Garud Iyengar, Anuj Kumar

Abstract: We present a simple dynamic equilibrium model for an online exchange where both buyers and sellers arrive according to an exogenously defined stochastic process. The structure of this exchange is motivated by the limit order book mechanism used in stock markets. Both buyers and sellers are elastic in the price-quantity space; however, only the sellers are assumed to be patient, i.e. only the sellers have a price-time elasticity, whereas the buyers are assumed to be impatient. Sellers select their selling price as a best response to all the other sellers' strategies. We define and establish the existence of the equilibrium in this model and show how to numerically compute this equilibrium. We also show how to compute other relevant quantities such as the equilibrium expected time to sale and equilibrium expected order density, as well as the expected order density conditioned on current selling price. We derive a closed form for the equilibrium distribution when the demand is price independent. At this equilibrium the selling (limit order) price distribution is power tailed, as is empirically observed in order driven financial markets.

Submitted 28 March, 2007; v1 submitted 12 December, 2006; originally announced December 2006.

Comments: 15 pages, 4 figures

ACM Class: J.4; G.3

arXiv:cs/0611063 [pdf, ps, other]

Characterizing Optimal Adword Auctions

Authors: Garud Iyengar, Anuj Kumar

Abstract: We present a number of models for the adword auctions used for pricing advertising slots on search engines such as Google, Yahoo! etc. We begin with a general problem formulation which allows the privately known valuation per click to be a function of both the identity of the advertiser and the slot. We present a compact characterization of the set of all deterministic incentive compatible direct mechanisms for this model. This new characterization allows us to conclude that there are incentive compatible mechanisms for this auction with a multi-dimensional type-space that are not affine maximizers. Next, we discuss two interesting special cases: slot independent valuation and slot independent valuation up to a privately known slot and zero thereafter. For both of these special cases, we characterize revenue maximizing and efficiency maximizing mechanisms and show that these mechanisms can be computed with a worst case computational complexity $O(n^2m^2)$ and $O(n^2m^3)$ respectively, where $n$ is the number of bidders and $m$ is the number of slots. Next, we characterize optimal rank based allocation rules and propose a new mechanism that we call the customized rank based allocation. We report the results of a numerical study that compares the revenue and efficiency of the proposed mechanisms. The numerical results suggest that the customized rank-based allocation rule is significantly superior to the rank-based allocation rules.

Submitted 15 November, 2006; originally announced November 2006.

Comments: 29 pages, work was presented at a) Second Workshop on Sponsored Search Auctions, Ann Arbor, MI b) INFORMS Annual Meeting, Pittsburgh c) Decision Sciences Seminar, Fuqua School of Business, Duke University

Report number: CORC Technical Report TR-2006-04 at Computational Optimization Research Center at Columbia University

Showing 1–25 of 25 results for author: Iyengar, G