-
Learning from Few Samples: Transformation-Invariant SVMs with Composition and Locality at Multiple Scales
Authors:
Tao Liu,
P. R. Kumar,
Ruida Zhou,
Xi Liu
Abstract:
Motivated by the problem of learning with small sample sizes, this paper shows how to incorporate into support-vector machines (SVMs) those properties that have made convolutional neural networks (CNNs) successful. Particularly important is the ability to incorporate domain knowledge of invariances, e.g., translational invariance of images. Kernels based on the \textit{maximum} similarity over a group of transformations are not generally positive definite. Perhaps it is for this reason that they have not been studied theoretically. We address this lacuna and show that positive definiteness indeed holds \textit{with high probability} for kernels based on the maximum similarity in the small training sample set regime of interest, and that they do yield the best results in that regime. We also show how additional properties such as their ability to incorporate local features at multiple spatial scales, e.g., as done in CNNs through max pooling, and to provide the benefits of composition through the architecture of multiple layers, can also be embedded into SVMs. We verify through experiments on widely available image sets that the resulting SVMs do provide superior accuracy in comparison to well-established deep neural network benchmarks for small sample sizes.
Submitted 22 October, 2022; v1 submitted 27 September, 2021;
originally announced September 2021.
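A minimal sketch of the kind of maximum-similarity kernel the abstract above describes, under stated assumptions: cyclic shifts of 1-D vectors stand in for image translations, the base kernel is a Gaussian RBF, and all function names and parameter values are hypothetical rather than taken from the paper. It builds the Gram matrix for a small sample and computes its smallest eigenvalue, which is one way to check empirically whether positive definiteness holds on a given small training set.

```python
import numpy as np

def rbf(x, y, gamma=0.5):
    """Gaussian RBF base kernel between two 1-D vectors."""
    d = x - y
    return float(np.exp(-gamma * np.dot(d, d)))

def max_shift_kernel(x, y, gamma=0.5):
    """Maximum base-kernel similarity over all cyclic shifts of y.

    Cyclic shifts are an illustrative stand-in for a transformation group.
    """
    return max(rbf(x, np.roll(y, s), gamma) for s in range(len(y)))

def gram(X, gamma=0.5):
    """Gram matrix of the max-similarity kernel on the rows of X."""
    n = len(X)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = max_shift_kernel(X[i], X[j], gamma)
    return K

# Small synthetic sample (hypothetical data, 5 points in 8 dimensions).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
K = gram(X)

# The smallest eigenvalue tells us whether this particular Gram matrix
# is positive definite; the kernel is not guaranteed to be so in general.
min_eig = float(np.linalg.eigvalsh(K).min())
print("smallest Gram eigenvalue:", min_eig)
```

By construction the kernel value is unchanged when either argument is cyclically shifted, which is the invariance property being encoded. A precomputed Gram matrix of this kind can be passed to an SVM solver that accepts precomputed kernels.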
-
Reward-Biased Maximum Likelihood Estimation for Linear Stochastic Bandits
Authors:
Yu-Heng Hung,
Ping-Chun Hsieh,
Xi Liu,
P. R. Kumar
Abstract:
Modifying the reward-biased maximum likelihood method originally proposed in the adaptive control literature, we propose novel learning algorithms to handle the explore-exploit trade-off in linear bandit problems as well as generalized linear bandit problems. We develop novel index policies that we prove achieve order-optimality, and show that they achieve empirical performance competitive with the state-of-the-art benchmark methods in extensive experiments. The new policies achieve this with low computation time per pull for linear bandits, thereby attaining both favorable regret and computational efficiency.
Submitted 8 October, 2020;
originally announced October 2020.
-
Learning in Networked Control Systems
Authors:
Rahul Singh,
P. R. Kumar
Abstract:
We design an adaptive controller (learning rule) for a networked control system (NCS) in which data packets containing control information are transmitted across a lossy wireless channel. We propose Upper Confidence Bounds for Networked Control Systems (UCB-NCS), a learning rule that maintains confidence intervals for the estimates of the plant parameters $(A_{(\star)},B_{(\star)})$ and the channel reliability $p_{(\star)}$, and utilizes the principle of optimism in the face of uncertainty while making control decisions. We provide non-asymptotic performance guarantees for UCB-NCS by analyzing its "regret", i.e., the performance gap relative to the scenario in which $(A_{(\star)},B_{(\star)},p_{(\star)})$ are known to the controller. We show that, with high probability, the regret can be upper-bounded as $\tilde{O}\left(C\sqrt{T}\right)$ (here $\tilde{O}$ hides logarithmic factors), where $T$ is the operating time horizon of the system and $C$ is a problem-dependent constant.
Submitted 21 March, 2020;
originally announced March 2020.
-
Exploration Through Reward Biasing: Reward-Biased Maximum Likelihood Estimation for Stochastic Multi-Armed Bandits
Authors:
Xi Liu,
Ping-Chun Hsieh,
Anirban Bhattacharya,
P. R. Kumar
Abstract:
Inspired by the Reward-Biased Maximum Likelihood Estimate method of adaptive control, we propose RBMLE -- a novel family of learning algorithms for stochastic multi-armed bandits (SMABs). For a broad range of SMABs including both the parametric Exponential Family as well as the non-parametric sub-Gaussian/Exponential family, we show that RBMLE yields an index policy. To choose the bias-growth rate $\alpha(t)$ in RBMLE, we reveal the nontrivial interplay between $\alpha(t)$ and the regret bound that generally applies in both the Exponential Family as well as the sub-Gaussian/Exponential family bandits. To quantify the finite-time performance, we prove that RBMLE attains order-optimality by adaptively estimating the unknown constants in the expression of $\alpha(t)$ for Gaussian and sub-Gaussian bandits. Extensive experiments demonstrate that the proposed RBMLE achieves empirical regret performance competitive with the state-of-the-art methods, while being more computationally efficient and scalable in comparison to the best-performing ones among them.
Submitted 23 October, 2020; v1 submitted 2 July, 2019;
originally announced July 2019.
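A rough sketch of what a reward-biased index policy could look like for the abstract above, under assumptions that the abstract itself does not state: rewards are Gaussian with unit variance, and the bias-growth rate is taken as $\alpha(t) = c \log t$ with a hypothetical constant `c`. Under the unit-variance assumption, maximizing the log-likelihood plus $\alpha(t)$ times the mean of arm $i$ gives an index of the form empirical mean plus $\alpha(t)/(2 n_i)$, where $n_i$ is the number of pulls of arm $i$. The function names and the simulation setup are illustrative only.

```python
import math
import random

def rbmle_gaussian_index(mean, pulls, alpha):
    """Reward-biased index for a unit-variance Gaussian arm (assumed form)."""
    return mean + alpha / (2.0 * pulls)

def run_bandit(true_means, horizon, c=1.0, seed=0):
    """Simulate the index policy; returns the pull count of each arm."""
    rng = random.Random(seed)
    k = len(true_means)
    pulls = [0] * k
    sums = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            # Pull every arm once so each empirical mean is defined.
            arm = t - 1
        else:
            # Hypothetical bias-growth rate alpha(t) = c * log(t).
            alpha = c * math.log(t)
            arm = max(
                range(k),
                key=lambda i: rbmle_gaussian_index(
                    sums[i] / pulls[i], pulls[i], alpha
                ),
            )
        reward = rng.gauss(true_means[arm], 1.0)
        pulls[arm] += 1
        sums[arm] += reward
    return pulls

# Hypothetical three-armed instance.
pulls = run_bandit([0.1, 0.5, 0.9], horizon=2000)
print("pull counts per arm:", pulls)
```

The bias term shrinks as an arm is pulled more often, so rarely pulled arms receive a larger bonus; this is how the reward bias drives exploration in the sketch.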
-
Stay With Me: Lifetime Maximization Through Heteroscedastic Linear Bandits With Reneging
Authors:
Ping-Chun Hsieh,
Xi Liu,
Anirban Bhattacharya,
P. R. Kumar
Abstract:
Sequential decision making for lifetime maximization is a critical problem in many real-world applications, such as medical treatment and portfolio selection. In these applications, a "reneging" phenomenon, where participants may disengage from future interactions after observing an unsatisfactory outcome, is rather prevalent. To address this issue, this paper proposes a model of heteroscedastic linear bandits with reneging, which allows each participant to have a distinct "satisfaction level," with any interaction outcome falling short of that level resulting in that participant reneging. Moreover, it allows the variance of the outcome to be context-dependent. Based on this model, we develop a UCB-type policy, namely HR-UCB, and prove that it achieves $\mathcal{O}\big(\sqrt{{T}(\log({T}))^{3}}\big)$ regret. Finally, we validate the performance of HR-UCB via simulations.
Submitted 15 May, 2019; v1 submitted 29 October, 2018;
originally announced October 2018.
-
Dynamic Modeling of Price Responsive Demand in Real-time Electricity Market: Empirical Analysis
Authors:
Jaeyong An,
P. R. Kumar,
Le Xie
Abstract:
In this paper, we study the price responsiveness of electricity consumption using empirical commercial and industrial load data obtained from Texas. Employing a dynamical system perspective, we show that price-responsive demand can be modeled as a hybrid of a Hammerstein model with delay following a price surge, and a linear ARX model under moderate price changes. It is observed that electricity consumption therefore has unique characteristics, including (1) qualitatively distinct responses to moderate and extremely high prices; and (2) a time delay associated with the response to high prices. It is shown that these observed features may render traditional approaches to demand response and retail pricing based on classical economic theories ineffective. In particular, ultimate real-time retail pricing may be less beneficial than classical economic theories suggest.
Submitted 15 December, 2016;
originally announced December 2016.