Search | arXiv e-print repository

Learning from Few Samples: Transformation-Invariant SVMs with Composition and Locality at Multiple Scales

Authors: Tao Liu, P. R. Kumar, Ruida Zhou, Xi Liu

Abstract: Motivated by the problem of learning with small sample sizes, this paper shows how to incorporate into support-vector machines (SVMs) those properties that have made convolutional neural networks (CNNs) successful. Particularly important is the ability to incorporate domain knowledge of invariances, e.g., translational invariance of images. Kernels based on the \textit{maximum} similarity over a group of transformations are not generally positive definite. Perhaps it is for this reason that they have not been studied theoretically. We address this lacuna and show that positive definiteness indeed holds \textit{with high probability} for kernels based on the maximum similarity in the small training sample set regime of interest, and that they do yield the best results in that regime. We also show how additional properties such as their ability to incorporate local features at multiple spatial scales, e.g., as done in CNNs through max pooling, and to provide the benefits of composition through the architecture of multiple layers, can also be embedded into SVMs. We verify through experiments on widely available image sets that the resulting SVMs do provide superior accuracy in comparison to well-established deep neural network benchmarks for small sample sizes.

Submitted 22 October, 2022; v1 submitted 27 September, 2021; originally announced September 2021.

Comments: Will appear in NeurIPS 2022
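
The abstract above notes that kernels built from the maximum similarity over a group of transformations are not generally positive definite. A minimal sketch of how one might probe this numerically follows. It is not the authors' construction: the Gaussian base kernel, the use of cyclic shifts of 1-D signals as the transformation group, the random data, and all function names are assumptions made purely for illustration.

```python
import numpy as np


def base_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) similarity between two 1-D signals (assumed base kernel)."""
    return float(np.exp(-gamma * np.sum((x - y) ** 2)))


def max_shift_kernel(x, y, gamma=0.5):
    """Maximum base-kernel similarity over all cyclic shifts of y."""
    return max(base_kernel(x, np.roll(y, s), gamma) for s in range(len(y)))


def gram_matrix(samples, gamma=0.5):
    """Gram matrix of the max-over-shifts kernel on a small sample set."""
    n = len(samples)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = max_shift_kernel(samples[i], samples[j], gamma)
    return K


def min_eigenvalue(K):
    """Smallest eigenvalue of the symmetrized Gram matrix.

    A negative value means the kernel is not positive semidefinite
    on this particular sample set.
    """
    return float(np.linalg.eigvalsh((K + K.T) / 2).min())


# Hypothetical data: 8 random signals of length 16.
rng = np.random.default_rng(0)
samples = rng.normal(size=(8, 16))
K = gram_matrix(samples)
print("smallest Gram eigenvalue:", min_eigenvalue(K))
```

Repeating the check over many random draws and sample sizes gives an empirical estimate of how often the Gram matrix is positive definite; the sketch itself makes no claim about what that frequency is.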

arXiv:2010.04091 [pdf, ps, other]

Reward-Biased Maximum Likelihood Estimation for Linear Stochastic Bandits

Authors: Yu-Heng Hung, Ping-Chun Hsieh, Xi Liu, P. R. Kumar

Abstract: Modifying the reward-biased maximum likelihood method originally proposed in the adaptive control literature, we propose novel learning algorithms to handle the explore-exploit trade-off in linear bandit problems as well as generalized linear bandit problems. We develop novel index policies that we prove achieve order-optimality, and show that they achieve empirical performance competitive with the state-of-the-art benchmark methods in extensive experiments. The new policies achieve this with low computation time per pull for linear bandits, thereby providing both favorable regret and computational efficiency.

Submitted 8 October, 2020; originally announced October 2020.

arXiv:2003.09596 [pdf, ps, other]

Learning in Networked Control Systems

Authors: Rahul Singh, P. R. Kumar

Abstract: We design an adaptive controller (learning rule) for a networked control system (NCS) in which data packets containing control information are transmitted across a lossy wireless channel. We propose Upper Confidence Bounds for Networked Control Systems (UCB-NCS), a learning rule that maintains confidence intervals for the estimates of plant parameters $(A_{(\star)},B_{(\star)})$ and channel reliability $p_{(\star)}$, and utilizes the principle of optimism in the face of uncertainty while making control decisions. We provide non-asymptotic performance guarantees for UCB-NCS by analyzing its "regret", i.e., the performance gap from the scenario in which $(A_{(\star)},B_{(\star)},p_{(\star)})$ are known to the controller. We show that with high probability the regret can be upper-bounded as $\tilde{O}\left(C\sqrt{T}\right)$, where $\tilde{O}$ hides logarithmic factors, $T$ is the operating time horizon of the system, and $C$ is a problem-dependent constant.

Submitted 21 March, 2020; originally announced March 2020.

Comments: Submitted to CDC and LCSS (http://ieee-cssletters.dei.unipd.it/index.php)

arXiv:1907.01287 [pdf, ps, other]

Exploration Through Reward Biasing: Reward-Biased Maximum Likelihood Estimation for Stochastic Multi-Armed Bandits

Authors: Xi Liu, Ping-Chun Hsieh, Anirban Bhattacharya, P. R. Kumar

Abstract: Inspired by the Reward-Biased Maximum Likelihood Estimate method of adaptive control, we propose RBMLE -- a novel family of learning algorithms for stochastic multi-armed bandits (SMABs). For a broad range of SMABs including both the parametric Exponential Family as well as the non-parametric sub-Gaussian/Exponential family, we show that RBMLE yields an index policy. To choose the bias-growth rate $α(t)$ in RBMLE, we reveal the nontrivial interplay between $α(t)$ and the regret bound that generally applies in both the Exponential Family as well as the sub-Gaussian/Exponential family bandits. To quantify the finite-time performance, we prove that RBMLE attains order-optimality by adaptively estimating the unknown constants in the expression of $α(t)$ for Gaussian and sub-Gaussian bandits. Extensive experiments demonstrate that the proposed RBMLE achieves empirical regret performance competitive with the state-of-the-art methods, while being more computationally efficient and scalable in comparison to the best-performing ones among them.

Submitted 23 October, 2020; v1 submitted 2 July, 2019; originally announced July 2019.

Comments: ICML 2020

arXiv:1810.12418 [pdf, ps, other]

Stay With Me: Lifetime Maximization Through Heteroscedastic Linear Bandits With Reneging

Authors: Ping-Chun Hsieh, Xi Liu, Anirban Bhattacharya, P. R. Kumar

Abstract: Sequential decision making for lifetime maximization is a critical problem in many real-world applications, such as medical treatment and portfolio selection. In these applications, a "reneging" phenomenon, where participants may disengage from future interactions after observing an unsatisfactory outcome, is rather prevalent. To address this issue, this paper proposes a model of heteroscedastic linear bandits with reneging, which allows each participant to have a distinct "satisfaction level," with any interaction outcome falling short of that level resulting in that participant reneging. Moreover, it allows the variance of the outcome to be context-dependent. Based on this model, we develop a UCB-type policy, namely HR-UCB, and prove that it achieves $\mathcal{O}\big(\sqrt{{T}(\log({T}))^{3}}\big)$ regret. Finally, we validate the performance of HR-UCB via simulations.

Submitted 15 May, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

Comments: To appear in ICML 2019. Results are averaged over more rounds of experiments than in earlier versions.

arXiv:1612.05021 [pdf, other]

Dynamic Modeling of Price Responsive Demand in Real-time Electricity Market: Empirical Analysis

Authors: Jaeyong An, P. R. Kumar, Le Xie

Abstract: In this paper, we study the price responsiveness of electricity consumption from empirical commercial and industrial load data obtained from Texas. Employing a dynamical system perspective, we show that price responsive demand can be modeled as a hybrid of a Hammerstein model with delay following a price surge, and a linear ARX model under moderate price changes. It is observed that electricity consumption therefore has unique characteristics including (1) qualitatively distinct responses to moderate and extremely high prices; and (2) a time delay associated with the response to high prices. It is shown that these observed features may render traditional approaches to demand response and retail pricing based on classical economic theories ineffective. In particular, ultimate real-time retail pricing may be less beneficial than classical economic theories suggest.

Submitted 15 December, 2016; originally announced December 2016.

Showing 1–6 of 6 results for author: Kumar, P R