-
A General Control-Theoretic Approach for Reinforcement Learning: Theory and Algorithms
Authors:
Weiqin Chen,
Mark S. Squillante,
Chai Wah Wu,
Santiago Paternain
Abstract:
We devise a control-theoretic reinforcement learning approach to support direct learning of the optimal policy. We establish theoretical properties of our approach and derive an algorithm based on a specific instance of this approach. Our empirical results demonstrate the significant benefits of our approach.
Submitted 20 June, 2024;
originally announced June 2024.
-
On Quantum Algorithms for Efficient Solutions of General Classes of Structured Markov Processes
Authors:
Vasileios Kalantzis,
Mark S. Squillante,
Shashanka Ubaru
Abstract:
We study the fundamental problem of efficiently computing the stationary distribution of general classes of structured Markov processes. In strong contrast with previous work, we consider this problem within the context of quantum computational environments from a mathematical perspective and devise the first quantum algorithms for computing the stationary distribution of structured Markov processes. We derive a mathematical analysis of the computational properties of our quantum algorithms together with related theoretical results, establishing that our quantum algorithms provide the potential for significant computational improvements over that of the best-known classical algorithms in various settings of both theoretical and practical importance. Although motivated by structured Markov processes, our quantum algorithms have the potential for being exploited to address a much larger class of numerical computation problems.
Submitted 27 April, 2024;
originally announced April 2024.
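For readers unfamiliar with the underlying problem, the stationary distribution $π$ of a finite Markov chain with transition matrix $P$ satisfies $πP = π$ with entries summing to one. The following is a minimal classical sketch only (plain power iteration on a small chain), not the quantum algorithms of the paper; the function name `stationary_distribution`, the two-state matrix `P`, and the assumption of an irreducible, aperiodic chain are illustrative choices.

```python
def stationary_distribution(P, iters=500):
    """Approximate pi with pi P = pi by repeated left-multiplication.

    Assumes P is a row-stochastic matrix (list of lists) for an
    irreducible, aperiodic finite chain, so the iteration converges.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        # One step of pi <- pi P
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        total = sum(pi)
        pi = [p / total for p in pi]  # renormalize against rounding drift
    return pi


# Hypothetical two-state example; solving pi P = pi by hand gives (5/6, 1/6).
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)
```

Power iteration costs one matrix-vector product per step, which is the kind of classical baseline against which structured or quantum methods are usually compared.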
-
Multi-Function Multi-Way Analog Technology for Sustainable Machine Intelligence Computation
Authors:
Vassilis Kalantzis,
Mark S. Squillante,
Shashanka Ubaru,
Tayfun Gokmen,
Chai Wah Wu,
Anshul Gupta,
Haim Avron,
Tomasz Nowicki,
Malte Rasch,
Murat Onen,
Vanessa Lopez Marrero,
Effendi Leobandung,
Yasuteru Kohda,
Wilfried Haensch,
Lior Horesh
Abstract:
Numerical computation is essential to many areas of artificial intelligence (AI), whose computing demands continue to grow dramatically, yet their continued scaling is jeopardized by the slowdown in Moore's law. Multi-function multi-way analog (MFMWA) technology, a computing architecture comprising arrays of memristors supporting in-memory computation of matrix operations, can offer tremendous improvements in computation and energy, but at the expense of inherent unpredictability and noise. We devise novel randomized algorithms tailored to MFMWA architectures that mitigate the detrimental impact of imperfect analog computations while realizing their potential benefits across various areas of AI, such as applications in computer vision. Through analysis, measurements from analog devices, and simulations of larger systems, we demonstrate orders of magnitude reduction in both computation and energy with accuracy similar to digital computers.
Submitted 24 January, 2024;
originally announced January 2024.
-
Obtaining Explainable Classification Models using Distributionally Robust Optimization
Authors:
Sanjeeb Dash,
Soumyadip Ghosh,
Joao Goncalves,
Mark S. Squillante
Abstract:
Model explainability is crucial for human users to be able to interpret how a proposed classifier assigns labels to data based on its feature values. We study generalized linear models constructed using sets of feature value rules, which can capture nonlinear dependencies and interactions. An inherent trade-off exists between rule set sparsity and its prediction accuracy. It is computationally expensive to find the right choice of sparsity -- e.g., via cross-validation -- with existing methods. We propose a new formulation to learn an ensemble of rule sets that simultaneously addresses these competing factors. Good generalization is ensured while keeping computational costs low by utilizing distributionally robust optimization. The formulation utilizes column generation to efficiently search the space of rule sets and constructs a sparse ensemble of rule sets, in contrast with techniques like random forests or boosting and their variants. We present theoretical results that motivate and justify the use of our distributionally robust formulation. Extensive numerical experiments establish that our method improves over competing methods -- on a large set of publicly available binary classification problem instances -- with respect to one or more of the following metrics: generalization quality, computational cost, and explainability.
Submitted 3 November, 2023;
originally announced November 2023.
-
Stable iterative refinement algorithms for solving linear systems
Authors:
Chai Wah Wu,
Mark S. Squillante,
Vasileios Kalantzis,
Lior Horesh
Abstract:
Iterative refinement (IR) is a popular scheme for solving a linear system of equations based on gradually improving the accuracy of an initial approximation. Originally developed to improve upon the accuracy of Gaussian elimination, interest in IR has been revived because of its suitability for execution on fast low-precision hardware such as analog devices and graphics processing units. IR generally converges when the error associated with the solution method is small, but is known to diverge when this error is large. We propose and analyze a novel enhancement to the IR algorithm by adding a line search optimization step that guarantees the algorithm will not diverge. Numerical experiments verify our theoretical results and illustrate the effectiveness of our proposed scheme.
Submitted 14 September, 2023;
originally announced September 2023.
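The idea of guarding iterative refinement with a line search can be illustrated with a minimal sketch. This is not necessarily the scheme analyzed in the paper: the step size below is simply the one that minimizes the 2-norm of the next residual, and the names `refine` and `overshooting_solve`, the 2x2 system, and the deliberately bad inner solver (three times the exact correction) are all hypothetical choices made for illustration.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def refine(A, b, inner_solve, line_search=True, iters=20):
    """Iterative refinement x <- x + alpha * d, where d ~ A^{-1} r.

    With line_search=True, alpha minimizes ||r - alpha * A d||_2, so the
    residual norm cannot increase (in exact arithmetic). With
    line_search=False, alpha = 1 (classical iterative refinement).
    Returns the iterate and the residual norms seen at each iteration.
    """
    x = [0.0] * len(b)
    history = []
    for _ in range(iters):
        r = [bi - yi for bi, yi in zip(b, matvec(A, x))]
        history.append(dot(r, r) ** 0.5)
        d = inner_solve(r)  # inexact correction from a low-accuracy solver
        alpha = 1.0
        if line_search:
            Ad = matvec(A, d)
            denom = dot(Ad, Ad)
            if denom == 0.0:  # zero correction: nothing left to improve
                break
            alpha = dot(r, Ad) / denom
        x = [xi + alpha * di for xi, di in zip(x, d)]
    return x, history


# Hypothetical test system; its exact solution is (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]


def overshooting_solve(r):
    """Deliberately poor inner solver: returns 3x the exact correction."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    exact = [(A[1][1] * r[0] - A[0][1] * r[1]) / det,
             (A[0][0] * r[1] - A[1][0] * r[0]) / det]
    return [3.0 * e for e in exact]


x_ls, hist_ls = refine(A, b, overshooting_solve, line_search=True)
x_plain, hist_plain = refine(A, b, overshooting_solve, line_search=False, iters=10)
```

With this inner solver, the plain update doubles the residual at every step, while the line-search variant rescales the correction and keeps the residual norm from growing.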
-
Generalization Performance of Transfer Learning: Overparameterized and Underparameterized Regimes
Authors:
Peizhong Ju,
Sen Lin,
Mark S. Squillante,
Yingbin Liang,
Ness B. Shroff
Abstract:
Transfer learning is a useful technique for achieving improved performance and reducing training costs by leveraging the knowledge gained from source tasks and applying it to target tasks. Assessing the effectiveness of transfer learning relies on understanding the similarity between the ground truth of the source and target tasks. In real-world applications, tasks often exhibit partial similarity, where certain aspects are similar while others are different or irrelevant. To investigate the impact of partial similarity on transfer learning performance, we focus on a linear regression model with two distinct sets of features: a common part shared across tasks and a task-specific part. Our study explores various types of transfer learning, encompassing two options for parameter transfer. By establishing a theoretical characterization on the error of the learned model, we compare these transfer learning options, particularly examining how generalization performance changes with the number of features/parameters in both underparameterized and overparameterized regimes. Furthermore, we provide practical guidelines for determining the number of features in the common and task-specific parts for improved generalization performance. For example, when the total number of features in the source task's learning model is fixed, we show that it is more advantageous to allocate a greater number of redundant features to the task-specific part rather than the common part. Moreover, in specific scenarios, particularly those characterized by high noise levels and small true parameters, sacrificing certain true features in the common part in favor of employing more redundant features in the task-specific part can yield notable benefits.
Submitted 8 June, 2023; v1 submitted 7 June, 2023;
originally announced June 2023.
-
Towards a Unification of Logic and Information Theory
Authors:
Luis A. Lastras,
Barry Trager,
Jonathan Lenchner,
Wojtek Szpankowski,
Chai Wah Wu,
Mark Squillante,
Alex Gray
Abstract:
This article introduces a theory of communication that covers the following generic scenario: Alice knows more than Bob about a certain set of logic propositions and Alice and Bob wish to communicate as efficiently as possible with the shared goal that, following their communication, Bob should be able to deduce a particular logic proposition that Alice knows to be true.
We assume that our logic system is propositional logic, and we build on top of one of the legendary works in this area, namely the work of Carnap and Bar-Hillel on a theory of semantic information. Our main contribution is a collection of theorems studying various different assumptions on what Alice and Bob know and what their goal is. These theorems all provide sharp upper and lower bounds phrased in terms of an entropy-like function that we call $Λ$, in reference to its apparent connection to problems of communication involving logic. It turns out that when the goal is to communicate only a portion of the knowledge that Alice possesses, the optimum communication cost is lower than most people seem to assume, yet unavoidably, such optimum communication strategies end up allowing Bob to prove even more things than originally intended. Another interesting outcome is that in some scenarios, Alice need not know the logic statements that Bob knows in order to attain asymptotically the same communication efficiency as if she knew the statement, in a nod to the famous Slepian-Wolf and Wyner-Ziv results from source coding theory. Our work also introduces practical codes, which are comprised of a combination of linear codes and enumerative source codes, which turn out to be asymptotically optimal for some scenarios.
Submitted 16 April, 2024; v1 submitted 25 January, 2023;
originally announced January 2023.
-
Topological data analysis on noisy quantum computers
Authors:
Ismail Yunus Akhalwaya,
Shashanka Ubaru,
Kenneth L. Clarkson,
Mark S. Squillante,
Vishnu Jejjala,
Yang-Hui He,
Kugendran Naidoo,
Vasileios Kalantzis,
Lior Horesh
Abstract:
Topological data analysis (TDA) is a powerful technique for extracting complex and valuable shape-related summaries of high-dimensional data. However, the computational demands of classical algorithms for computing TDA are exorbitant, and quickly become impractical for high-order characteristics. Quantum computers offer the potential of achieving significant speedup for certain computational problems. Indeed, TDA has been purported to be one such problem, yet, quantum computing algorithms proposed for the problem, such as the original Quantum TDA (QTDA) formulation by Lloyd, Garnerone and Zanardi, require fault-tolerance qualifications that are currently unavailable. In this study, we present NISQ-TDA, a fully implemented end-to-end quantum machine learning algorithm needing only a short circuit-depth, that is applicable to high-dimensional classical data, and with provable asymptotic speedup for certain classes of problems. The algorithm neither suffers from the data-loading problem nor does it need to store the input data on the quantum computer explicitly. The algorithm was successfully executed on quantum computing devices, as well as on noisy quantum simulators, applied to small datasets. Preliminary empirical results suggest that the algorithm is robust to noise.
Submitted 19 March, 2024; v1 submitted 19 September, 2022;
originally announced September 2022.
-
A Class of Geometric Structures in Transfer Learning: Minimax Bounds and Optimality
Authors:
Xuhui Zhang,
Jose Blanchet,
Soumyadip Ghosh,
Mark S. Squillante
Abstract:
We study the problem of transfer learning, observing that previous efforts to understand its information-theoretic limits do not fully exploit the geometric structure of the source and target domains. In contrast, our study first illustrates the benefits of incorporating a natural geometric structure within a linear regression model, which corresponds to the generalized eigenvalue problem formed by the Gram matrices of both domains. We next establish a finite-sample minimax lower bound, propose a refined model interpolation estimator that enjoys a matching upper bound, and then extend our framework to multiple source domains and generalized linear models. Surprisingly, as long as information is available on the distance between the source and target parameters, negative-transfer does not occur. Simulation studies show that our proposed interpolation estimator outperforms state-of-the-art transfer learning methods in both moderate- and high-dimensional settings.
Submitted 23 February, 2022;
originally announced February 2022.
-
Quantum Topological Data Analysis with Linear Depth and Exponential Speedup
Authors:
Shashanka Ubaru,
Ismail Yunus Akhalwaya,
Mark S. Squillante,
Kenneth L. Clarkson,
Lior Horesh
Abstract:
Quantum computing offers the potential of exponential speedups for certain classical computations. Over the last decade, many quantum machine learning (QML) algorithms have been proposed as candidates for such exponential improvements. However, two issues unravel the hope of exponential speedup for some of these QML algorithms: the data-loading problem and, more recently, the stunning dequantization results of Tang et al. A third issue, namely the fault-tolerance requirements of most QML algorithms, has further hindered their practical realization. The quantum topological data analysis (QTDA) algorithm of Lloyd, Garnerone and Zanardi was one of the first QML algorithms that convincingly offered an expected exponential speedup. From the outset, it did not suffer from the data-loading problem. A recent result has also shown that the generalized problem solved by this algorithm is likely classically intractable, and would therefore be immune to any dequantization efforts. However, the QTDA algorithm of Lloyd et al. has a time complexity of $O(n^4/(ε^2 δ))$ (where $n$ is the number of data points, $ε$ is the error tolerance, and $δ$ is the smallest nonzero eigenvalue of the restricted Laplacian) and requires fault-tolerant quantum computing, which has not yet been achieved. In this paper, we completely overhaul the QTDA algorithm to achieve an improved exponential speedup and depth complexity of $O(n\log(1/(δε)))$. Our approach includes three key innovations: (a) an efficient realization of the combinatorial Laplacian as a sum of Pauli operators; (b) a quantum rejection sampling approach to restrict the superposition to the simplices in the complex; and (c) a stochastic rank estimation method to estimate the Betti numbers. We present a theoretical error analysis, and the circuit and computational time and depth complexities for Betti number estimation.
Submitted 5 August, 2021;
originally announced August 2021.
-
Solving sparse linear systems with approximate inverse preconditioners on analog devices
Authors:
Vasileios Kalantzis,
Anshul Gupta,
Lior Horesh,
Tomasz Nowicki,
Mark S. Squillante,
Chai Wah Wu
Abstract:
Sparse linear system solvers are computationally expensive kernels that lie at the heart of numerous applications. This paper proposes a flexible preconditioning framework to substantially reduce the time and energy requirements of this task by utilizing a hybrid architecture that combines conventional digital microprocessors with analog crossbar array accelerators. Our analysis and experiments with a simulator for analog hardware demonstrate that an order of magnitude speedup is readily attainable without much impact on convergence, despite the noise in analog computations.
Submitted 14 July, 2021;
originally announced July 2021.
-
Unbiased Gradient Estimation for Distributionally Robust Learning
Authors:
Soumyadip Ghosh,
Mark Squillante
Abstract:
Seeking to improve model generalization, we consider a new approach based on distributionally robust learning (DRL) that applies stochastic gradient descent to the outer minimization problem. Our algorithm efficiently estimates the gradient of the inner maximization problem through multi-level Monte Carlo randomization. Leveraging theoretical results that shed light on why standard gradient estimators fail, we establish the optimal parameterization of the gradient estimators of our approach that balances a fundamental tradeoff between computation time and statistical variance. Numerical experiments demonstrate that our DRL approach yields significant benefits over previous work.
Submitted 22 December, 2020;
originally announced December 2020.
-
Optimal Scheduling Control in Fluid Models of General $n\times n$ Input-Queued Switches
Authors:
Yingdong Lu,
Mark S. Squillante,
Tonghoon Suk
Abstract:
Most of the early input-queued switch research focused on establishing throughput optimality of the max-weight scheduling policy, with some recent research showing that max-weight scheduling is optimal with respect to total expected delay asymptotically in the heavy-traffic regime. However, the question of delay-optimal scheduling in input-queued switches remains open in general, as does the question of delay-optimal scheduling under more general objective functions. To gain fundamental insights into these very difficult problems, we consider a fluid model of $n \times n$ input-queued switches with associated fluid-flow costs, and we derive an optimal scheduling control policy to an infinite horizon discounted control problem with a general linear objective function of fluid cost. Our optimal policy coincides with the $cμ$-rule in certain parameter domains. More generally, due to the input-queued switch constraints, the optimal policy takes the form of the solution to a flow maximization problem, after we identify the Lagrangian multipliers of some key constraints through carefully designed algorithms. Computational experiments demonstrate the benefits of our optimal scheduling policy over variants of max-weight scheduling within fluid models of input-queued switches.
Submitted 12 October, 2020; v1 submitted 30 October, 2019;
originally announced October 2019.
-
On Heavy-Traffic Optimal Scaling of $c$-Weighted MaxWeight Scheduling in Input-Queued Switches
Authors:
Yingdong Lu,
Siva Theja Maguluri,
Mark S. Squillante,
Tonghoon Suk
Abstract:
We consider the optimal control of input-queued switches under a cost-weighted variant of MaxWeight scheduling, for which we establish theoretical properties that include showing the algorithm exhibits optimal heavy-traffic queue-length scaling. Our results are expected to be of theoretical interest more broadly than input-queued switches.
Submitted 2 March, 2020; v1 submitted 3 July, 2019;
originally announced July 2019.
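As background for the two switch-scheduling entries above, MaxWeight scheduling in an $n \times n$ input-queued switch selects, in each time slot, the input-output matching (a permutation) with the largest total queue length. The sketch below is a brute-force illustration under stated assumptions: the function name `max_weight_schedule` is hypothetical, and multiplying each queue length by a per-queue cost `costs[i][j]` is only one plausible reading of a cost-weighted variant, not necessarily the exact form studied in the paper.

```python
from itertools import permutations


def max_weight_schedule(queues, costs=None):
    """Return the matching maximizing sum_i costs[i][m(i)] * queues[i][m(i)].

    queues[i][j] is the backlog at input i destined for output j.
    Enumerates all n! permutations, so it is only suitable for small n;
    a maximum-weight bipartite matching solver would be used in practice.
    """
    n = len(queues)
    if costs is None:
        costs = [[1.0] * n for _ in range(n)]  # unweighted MaxWeight
    best_perm, best_weight = None, float("-inf")
    for perm in permutations(range(n)):
        weight = sum(costs[i][perm[i]] * queues[i][perm[i]] for i in range(n))
        if weight > best_weight:
            best_perm, best_weight = perm, weight
    return best_perm, best_weight


# Hypothetical 2x2 backlog: unweighted MaxWeight serves queues (0,0) and (1,1).
queues = [[3, 0],
          [1, 2]]
plain = max_weight_schedule(queues)

# A large cost on queue (1,0) flips the decision to the other matching.
weighted = max_weight_schedule(queues, costs=[[1, 1], [10, 1]])
```

The example shows how cost weights can change the selected matching even when the backlogs are unchanged.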
-
A General Markov Decision Process Framework for Directly Learning Optimal Control Policies
Authors:
Yingdong Lu,
Mark S. Squillante,
Chai Wah Wu
Abstract:
We consider a new form of reinforcement learning (RL) that is based on opportunities to directly learn the optimal control policy and a general Markov decision process (MDP) framework devised to support these opportunities. Derivations of general classes of our control-based RL methods are presented, together with forms of exploration and exploitation in learning and applying the optimal control policy over time. Our general MDP framework extends the classical Bellman operator and optimality criteria by generalizing the definition and scope of a policy for any given state. We establish the convergence and optimality of our control-based methods, both in general and within various control paradigms (e.g., piecewise linear control policies), through this general MDP framework, including convergence of $Q$-learning within the context of our MDP framework. Our empirical results demonstrate and quantify the significant benefits of our approach.
Submitted 31 March, 2021; v1 submitted 28 May, 2019;
originally announced May 2019.
-
Optimising capacity allocation in networks of stochastic loss systems: A functional-form approach
Authors:
Brendan Patch,
Mark S. Squillante,
Peter M. Van de Ven
Abstract:
Motivated by a wide variety of applications, this paper introduces a general class of networks of stochastic loss systems in which congestion renders lost revenue due to customers or jobs being permanently removed from the system. We seek to balance the trade-off between mitigating congestion by increasing service capacity and maintaining low costs for the service capacity provided. Given the lack of analytical results and the computational burden of simulation-based methods, we propose a hybrid functional-form approach for finding the optimal resource allocation in general networks of stochastic loss systems that combines the speed of an analytical approach with the accuracy of simulation-based optimisation. The key insight is a core iterative algorithm that replaces the computationally expensive gradient estimation in simulation optimisation with a closed-form analytical approximation that is calibrated using a simple simulation run. Extensive computational experiments on complex networks show that our approach renders near-optimal solutions with objective function values that are comparable to those obtained using stochastic approximation, surrogate optimisation and Bayesian optimisation methods while requiring significantly less computational effort.
Submitted 11 May, 2022; v1 submitted 10 April, 2019;
originally announced April 2019.
-
PROVEN: Certifying Robustness of Neural Networks with a Probabilistic Approach
Authors:
Tsui-Wei Weng,
Pin-Yu Chen,
Lam M. Nguyen,
Mark S. Squillante,
Ivan Oseledets,
Luca Daniel
Abstract:
With deep neural networks providing state-of-the-art machine learning models for numerous machine learning tasks, quantifying the robustness of these models has become an important area of research. However, most of the research literature merely focuses on the \textit{worst-case} setting where the input of the neural network is perturbed with noises that are constrained within an $\ell_p$ ball; and several algorithms have been proposed to compute certified lower bounds of minimum adversarial distortion based on such worst-case analysis. In this paper, we address these limitations and extend the approach to a \textit{probabilistic} setting where the additive noises can follow a given distributional characterization. We propose a novel probabilistic framework PROVEN to PRObabilistically VErify Neural networks with statistical guarantees -- i.e., PROVEN certifies the probability that the classifier's top-1 prediction cannot be altered under any constrained $\ell_p$ norm perturbation to a given input. Importantly, we show that it is possible to derive closed-form probabilistic certificates based on current state-of-the-art neural network robustness verification frameworks. Hence, the probabilistic certificates provided by PROVEN come naturally and with almost no overhead when obtaining the worst-case certified lower bounds from existing methods such as Fast-Lin, CROWN and CNN-Cert. Experiments on small and large MNIST and CIFAR neural network models demonstrate our probabilistic approach can achieve up to around $75\%$ improvement in the robustness certification with at least a $99.99\%$ confidence compared with the worst-case robustness certificate delivered by CROWN.
Submitted 7 January, 2019; v1 submitted 18 December, 2018;
originally announced December 2018.
-
Efficient Stochastic Gradient Descent for Learning with Distributionally Robust Optimization
Authors:
Soumyadip Ghosh,
Mark Squillante,
Ebisa Wollega
Abstract:
Distributionally robust optimization (DRO) problems are increasingly seen as a viable method to train machine learning models for improved model generalization. These min-max formulations, however, are more difficult to solve. We therefore provide a new stochastic gradient descent algorithm to efficiently solve this DRO formulation. Our approach applies gradient descent to the outer minimization formulation and estimates the gradient of the inner maximization based on a sample average approximation. The latter uses a subset of the data in each iteration, progressively increasing the subset size to ensure convergence. Theoretical results include establishing the optimal manner for growing the support size to balance a fundamental tradeoff between stochastic error and computational effort. Empirical results demonstrate the significant benefits of our approach over previous work, and also illustrate how learning with DRO can improve generalization.
Submitted 2 November, 2020; v1 submitted 22 May, 2018;
originally announced May 2018.
-
A General Family of Robust Stochastic Operators for Reinforcement Learning
Authors:
Yingdong Lu,
Mark S. Squillante,
Chai Wah Wu
Abstract:
We consider a new family of operators for reinforcement learning with the goal of alleviating the negative effects of approximation or estimation errors and becoming more robust to them. Various theoretical results are established, which include showing, on a sample-path basis, that our family of operators preserves optimality and increases the action gap. Our empirical results illustrate the strong benefits of our family of operators, which significantly outperforms the classical Bellman operator and recently proposed operators.
Submitted 28 May, 2019; v1 submitted 21 May, 2018;
originally announced May 2018.
-
Bounded-Velocity Stochastic Control for Dynamic Resource Allocation
Authors:
Xuefeng Gao,
Yingdong Lu,
Mayank Sharma,
Mark S. Squillante,
Joost W. Bosman
Abstract:
We consider a general class of dynamic resource allocation problems within a stochastic optimal control framework. This class of problems arises in a wide variety of applications, each of which intrinsically involves resources of different types and demand with uncertainty and/or variability. The goal involves dynamically allocating capacity for every resource type in order to serve the uncertain/variable demand, modeled as Brownian motion, and maximize the discounted expected net-benefit over an infinite time horizon based on the rewards and costs associated with the different resource types, subject to flexibility constraints on the rate of change of each type of resource capacity. We derive the optimal control policy within a bounded-velocity stochastic control setting, which includes efficient and easily implementable algorithms for governing the dynamic adjustments to resource allocation capacities over time. Computational experiments investigate various issues of both theoretical and practical interest, quantifying the benefits of our approach over recent alternative optimization approaches.
Submitted 3 January, 2018;
originally announced January 2018.
-
Control of Time-Varying Epidemic-Like Stochastic Processes and Their Mean-Field Limits
Authors:
Yingdong Lu,
Mark S. Squillante,
Chai Wah Wu
Abstract:
The optimal control of epidemic-like stochastic processes is important both historically and for emerging applications today, where it can be especially important to include time-varying parameters that impact viral epidemic-like propagation. We connect the control of such stochastic processes with time-varying behavior to the stochastic shortest path problem and obtain solutions for various cost functions. Then, under a mean-field scaling, this general class of stochastic processes is shown to converge to a corresponding dynamical system. We analogously establish that the optimal control of this class of processes converges to the optimal control of the limiting dynamical system. Consequently, we study the optimal control of the dynamical system where the comparison of both controlled systems renders various important mathematical properties of interest.
Submitted 27 September, 2017;
originally announced September 2017.
-
On the Control of Density-Dependent Stochastic Population Processes with Time-Varying Behavior
Authors:
Yingdong Lu,
Mark Squillante,
Chai Wah Wu
Abstract:
The study of density-dependent stochastic population processes is important from a historical perspective as well as from the perspective of a number of existing and emerging applications today. In more recent applications of these processes, it can be especially important to include time-varying parameters for the rates that impact the density-dependent population structures and behaviors. Under a mean-field scaling, we show that such density-dependent stochastic population processes with time-varying behavior converge to a corresponding dynamical system. We analogously establish that the optimal control of such density-dependent stochastic population processes converges to the optimal control of the limiting dynamical system. An analysis of both the dynamical system and its optimal control renders various important mathematical properties of interest.
Submitted 22 September, 2017;
originally announced September 2017.
-
Optimal Weighted-Delay Scheduling in $2\times 2$ Input-Queued Switches
Authors:
Yingdong Lu,
Siva Theja Maguluri,
Mark S. Squillante,
Tonghoon Suk,
Xiaofan Wu
Abstract:
Motivated by the scarcity of delay-optimal scheduling results relative to results on throughput optimality, we investigate a canonical input-queued switch scheduling problem in which the objective is to minimize the discounted delay cost over an infinite time horizon. We derive an optimal scheduling policy and establish corresponding theoretical properties, which are expected to be of interest more broadly than input-queued switches. Computational experiments demonstrate and quantify the benefits of our optimal scheduling policy over alternative policies such as variants of MaxWeight scheduling, which is well known to be throughput optimal and was more recently shown to be delay optimal in the heavy-traffic regime limit.
Submitted 10 February, 2023; v1 submitted 7 April, 2017;
originally announced April 2017.
-
On Optimal Portfolios of Dynamic Resource Allocations
Authors:
Yingdong Lu,
Siva Theja Maguluri,
Mark S. Squillante,
Chai Wah Wu
Abstract:
We consider the optimal allocation of generic resources among multiple generic entities of interest over a finite planning horizon, where each entity generates stochastic returns as a function of its resource allocation during each period. The main objective is to maximize the expected return while at the same time managing risk to an acceptable level for each period. We devise a general solution framework and establish how to obtain the optimal dynamic resource allocation.
Submitted 24 February, 2017;
originally announced February 2017.
-
Matching Supply and Demand in Production-Inventory Systems: Asymptotics and Optimization
Authors:
Yingdong Lu,
Mark S. Squillante,
David D. Yao
Abstract:
We consider a general class of high-volume, fast-moving production-inventory systems based on both lost-sales and backorder inventory models. Such systems require a fundamental understanding of the asymptotic behavior of key performance measures under various supply strategies, as well as the pre-planning of these strategies. Our analysis relies on a thorough study of the asymptotic behavior of a random walk with power drift, which is of independent interest. In addition to providing key insights, our analysis leads to approximations of the corresponding optimization problem that yield simple solutions which are close to optimal. We also establish an equivalence between the lost-sales and backorder models when both have the same penalty cost that becomes large.
Submitted 28 January, 2015;
originally announced January 2015.
-
Asymptotic Optimality of Constant-Order Policies for Lost Sales Inventory Models with Large Lead Times
Authors:
David A. Goldberg,
Dmitriy A. Katz-Rogozhnikov,
Yingdong Lu,
Mayank Sharma,
Mark S. Squillante
Abstract:
Lost sales inventory models with large lead times, which arise in many practical settings, are notoriously difficult to optimize due to the curse of dimensionality. In this paper we show that when lead times are large, a very simple constant-order policy, first studied by Reiman (\cite{Reiman04}), performs nearly optimally. The main insight of our work is that when the lead time is very large, such a significant amount of randomness is injected into the system between when an order for more inventory is placed and when that order is received, that "being smart" algorithmically provides almost no benefit. Our main proof technique combines a novel coupling for suprema of random walks with arguments from queueing theory.
Submitted 3 September, 2014; v1 submitted 16 November, 2012;
originally announced November 2012.