Showing 1–20 of 20 results for author: Shroff, N

  1. arXiv:2402.13901  [pdf, other]

    cs.LG eess.SP stat.ML

    Non-asymptotic Convergence of Discrete-time Diffusion Models: New Approach and Improved Rate

    Authors: Yuchen Liang, Peizhong Ju, Yingbin Liang, Ness Shroff

    Abstract: The denoising diffusion model has recently emerged as a powerful generative technique that converts noise into data. While there are many studies providing theoretical guarantees for diffusion processes based on discretized stochastic differential equation (D-SDE), many generative samplers in real applications directly employ a discrete-time (DT) diffusion process. However, there are very few stud…

    Submitted 30 May, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  2. arXiv:2310.02941  [pdf, ps, other]

    stat.ML cs.LG math.PR

    Hoeffding's Inequality for Markov Chains under Generalized Concentrability Condition

    Authors: Hao Chen, Abhishek Gupta, Yin Sun, Ness Shroff

    Abstract: This paper studies Hoeffding's inequality for Markov chains under the generalized concentrability condition defined via integral probability metric (IPM). The generalized concentrability condition establishes a framework that interpolates and extends the existing hypotheses of Markov chain Hoeffding-type inequalities. The flexibility of our framework allows Hoeffding's inequality to be applied bey…

    Submitted 4 October, 2023; originally announced October 2023.

  3. arXiv:2302.04374  [pdf, ps, other]

    cs.LG stat.ML

    Near-Optimal Adversarial Reinforcement Learning with Switching Costs

    Authors: Ming Shi, Yingbin Liang, Ness Shroff

    Abstract: Switching costs, which capture the costs for changing policies, are regarded as a critical metric in reinforcement learning (RL), in addition to the standard metric of losses (or rewards). However, existing studies on switching costs (with a coefficient $β$ that is strictly positive and is independent of $T$) have mainly focused on static RL, where the loss distribution is assumed to be fixed duri…

    Submitted 8 February, 2023; originally announced February 2023.

    Comments: Accepted by ICLR 2023 as Top 25%

  4. arXiv:2206.11889  [pdf, other]

    cs.LG cs.AI eess.SY math.OC stat.ML

    Provably Efficient Model-Free Constrained RL with Linear Function Approximation

    Authors: Arnob Ghosh, Xingyu Zhou, Ness Shroff

    Abstract: We study the constrained reinforcement learning problem, in which an agent aims to maximize the expected cumulative reward subject to a constraint on the expected total value of a utility function. In contrast to existing model-based approaches or model-free methods accompanied with a `simulator', we aim to develop the first model-free, simulator-free algorithm that achieves a sublinear regret and…

    Submitted 6 January, 2023; v1 submitted 23 June, 2022; originally announced June 2022.

    Comments: Accepted and Published at the 36th Neural Information Processing Systems (NeurIPS'22). Section J (where different episodes may start from different states) is added in this version

  5. arXiv:2206.02047  [pdf, ps, other]

    cs.LG math.ST stat.ML

    On the Generalization Power of the Overfitted Three-Layer Neural Tangent Kernel Model

    Authors: Peizhong Ju, Xiaojun Lin, Ness B. Shroff

    Abstract: In this paper, we study the generalization performance of overparameterized 3-layer NTK models. We show that, for a specific set of ground-truth functions (which we refer to as the "learnable set"), the test error of the overfitted 3-layer NTK is upper bounded by an expression that decreases with the number of neurons of the two hidden layers. Different from 2-layer NTK where there exists only one…

    Submitted 4 June, 2022; originally announced June 2022.

  6. arXiv:2103.05243  [pdf, ps, other]

    cs.LG math.ST stat.ML

    On the Generalization Power of Overfitted Two-Layer Neural Tangent Kernel Models

    Authors: Peizhong Ju, Xiaojun Lin, Ness B. Shroff

    Abstract: In this paper, we study the generalization performance of min $\ell_2$-norm overfitting solutions for the neural tangent kernel (NTK) model of a two-layer neural network with ReLU activation that has no bias term. We show that, depending on the ground-truth function, the test error of overfitted NTK models exhibits characteristics that are different from the "double-descent" of other overparameter…

    Submitted 7 March, 2023; v1 submitted 9 March, 2021; originally announced March 2021.

    Comments: Published in ICML21. This version fixes an error of Lemma 31 and other parts affected by this error. The main results remain the same except some small changes on certain coefficients of Eq.(9)

  7. arXiv:2007.13023  [pdf, other]

    cs.LG math.OC stat.ML

    A Partially Observable MDP Approach for Sequential Testing for Infectious Diseases such as COVID-19

    Authors: Rahul Singh, Fang Liu, Ness B. Shroff

    Abstract: The outbreak of the novel coronavirus (COVID-19) is unfolding as a major international crisis whose influence extends to every aspect of our daily lives. Effective testing allows infected individuals to be quarantined, thus reducing the spread of COVID-19, saving countless lives, and helping to restart the economy safely and securely. Developing a good testing strategy can be greatly aided by cont…

    Submitted 25 July, 2020; originally announced July 2020.

  8. arXiv:2007.03133  [pdf, other]

    cs.LG stat.ML

    The Sample Complexity of Best-$k$ Items Selection from Pairwise Comparisons

    Authors: Wenbo Ren, Jia Liu, Ness B. Shroff

    Abstract: This paper studies the sample complexity (aka number of comparisons) bounds for the active best-$k$ items selection from pairwise comparisons. From a given set of items, the learner can make pairwise comparisons on every pair of items, and each comparison returns an independent noisy result about the preferred item. At any time, the learner can adaptively choose a pair of items to compare accordin…

    Submitted 29 July, 2021; v1 submitted 6 July, 2020; originally announced July 2020.

  9. arXiv:2007.03121  [pdf, other]

    cs.LG cs.CR stat.ML

    Multi-Armed Bandits with Local Differential Privacy

    Authors: Wenbo Ren, Xingyu Zhou, Jia Liu, Ness B. Shroff

    Abstract: This paper investigates the problem of regret minimization for multi-armed bandit (MAB) problems with local differential privacy (LDP) guarantee. In stochastic bandit systems, the rewards may refer to the users' activities, which may involve private information and the users may not want the agent to know. However, in many cases, the agent needs to know these activities to provide better services…

    Submitted 6 July, 2020; originally announced July 2020.

  10. arXiv:2006.03951  [pdf, other]

    cs.LG stat.ML

    Contextual Bandits with Side-Observations

    Authors: Rahul Singh, Fang Liu, Xin Liu, Ness Shroff

    Abstract: We investigate contextual bandits in the presence of side-observations across arms in order to design recommendation algorithms for users connected via social networks. Users in social networks respond to their friends' activity, and hence provide information about each other's preferences. In our model, when a learning algorithm recommends an article to a user, not only does it observe his/her re…

    Submitted 23 October, 2020; v1 submitted 6 June, 2020; originally announced June 2020.

    Comments: under review

  11. arXiv:2002.12435  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Learning in Markov Decision Processes under Constraints

    Authors: Rahul Singh, Abhishek Gupta, Ness B. Shroff

    Abstract: We consider reinforcement learning (RL) in Markov Decision Processes in which an agent repeatedly interacts with an environment that is modeled by a controlled Markov process. At each time step $t$, it earns a reward, and also incurs a cost-vector consisting of $M$ costs. We design model-based RL algorithms that maximize the cumulative reward earned over a time horizon of $T$ time-steps, while sim…

    Submitted 5 January, 2022; v1 submitted 27 February, 2020; originally announced February 2020.

    Journal ref: IEEE Transactions on Control of Network Systems, 31 August 2022

  12. arXiv:1909.03194  [pdf, other]

    cs.LG stat.ML

    On Sample Complexity Upper and Lower Bounds for Exact Ranking from Noisy Comparisons

    Authors: Wenbo Ren, Jia Liu, Ness B. Shroff

    Abstract: This paper studies the problem of finding the exact ranking from noisy comparisons. A comparison over a set of $m$ items produces a noisy outcome about the most preferred item, and reveals some information about the ranking. By repeatedly and adaptively choosing items to compare, we want to fully rank the items with a certain confidence, and use as few comparisons as possible. Different from most…

    Submitted 29 July, 2021; v1 submitted 7 September, 2019; originally announced September 2019.

  13. arXiv:1905.06494  [pdf, other]

    cs.LG cs.CR stat.ML

    Data Poisoning Attacks on Stochastic Bandits

    Authors: Fang Liu, Ness Shroff

    Abstract: Stochastic multi-armed bandits form a class of online learning problems that have important applications in online recommendation systems, adaptive medical treatment, and many others. Even though potential attacks against these learning algorithms may hijack their behavior, causing catastrophic loss in real-world applications, little is known about adversarial attacks on bandit algorithms. In this…

    Submitted 15 May, 2019; originally announced May 2019.

    Comments: Accepted by ICML 2019

  14. arXiv:1810.11857  [pdf, ps, other]

    cs.LG stat.ML

    Exploring $k$ out of Top $ρ$ Fraction of Arms in Stochastic Bandits

    Authors: Wenbo Ren, Jia Liu, Ness Shroff

    Abstract: This paper studies the problem of identifying any $k$ distinct arms among the top $ρ$ fraction (e.g., top 5\%) of arms from a finite or infinite set with a probably approximately correct (PAC) tolerance $ε$. We consider two cases: (i) when the threshold of the top arms' expected rewards is known and (ii) when it is unknown. We prove lower bounds for the four variants (finite or infinite arms, and…

    Submitted 19 November, 2020; v1 submitted 28 October, 2018; originally announced October 2018.

  15. arXiv:1806.02970  [pdf, ps, other]

    cs.LG stat.ML

    PAC Ranking from Pairwise and Listwise Queries: Lower Bounds and Upper Bounds

    Authors: Wenbo Ren, Jia Liu, Ness B. Shroff

    Abstract: This paper explores the adaptive (active) PAC (probably approximately correct) top-$k$ ranking (i.e., top-$k$ item selection) and total ranking problems from $l$-wise ($l\geq 2$) comparisons under the multinomial logit (MNL) model. By adaptively choosing sets to query and observing the noisy output of the most favored item of each query, we want to design ranking algorithms that recover the top-…

    Submitted 9 September, 2018; v1 submitted 8 June, 2018; originally announced June 2018.

  16. arXiv:1805.08930  [pdf, other]

    stat.ML cs.AI cs.LG

    Analysis of Thompson Sampling for Graphical Bandits Without the Graphs

    Authors: Fang Liu, Zizhan Zheng, Ness Shroff

    Abstract: We study multi-armed bandit problems with graph feedback, in which the decision maker is allowed to observe the neighboring actions of the chosen action, in a setting where the graph may vary over time and is never fully revealed to the decision maker. We show that when the feedback graphs are undirected, the original Thompson Sampling achieves the optimal (within logarithmic factors) regret…

    Submitted 22 May, 2018; originally announced May 2018.

    Comments: Accepted by UAI 2018

  17. arXiv:1804.05929  [pdf, other]

    cs.LG cs.AI stat.ML

    UCBoost: A Boosting Approach to Tame Complexity and Optimality for Stochastic Bandits

    Authors: Fang Liu, Sinong Wang, Swapna Buccapatnam, Ness Shroff

    Abstract: In this work, we address the open problem of finding low-complexity near-optimal multi-armed bandit algorithms for sequential decision making problems. Existing bandit algorithms are either sub-optimal and computationally simple (e.g., UCB1) or optimal and computationally complex (e.g., kl-UCB). We propose a boosting approach to Upper Confidence Bound based algorithms for stochastic bandits, that…

    Submitted 16 April, 2018; originally announced April 2018.

    Comments: Accepted by IJCAI 2018

  18. arXiv:1711.03539  [pdf, other]

    cs.LG cs.AI stat.ML

    A Change-Detection based Framework for Piecewise-stationary Multi-Armed Bandit Problem

    Authors: Fang Liu, Joohyun Lee, Ness Shroff

    Abstract: The multi-armed bandit problem has been extensively studied under the stationary assumption. However in reality, this assumption often does not hold because the distributions of rewards themselves may change over time. In this paper, we propose a change-detection (CD) based framework for multi-armed bandit problems under the piecewise-stationary setting, and study a class of change-detection based…

    Submitted 20 November, 2017; v1 submitted 8 November, 2017; originally announced November 2017.

    Comments: Accepted by AAAI 2018

  19. arXiv:1711.03198  [pdf, other]

    cs.LG cs.AI stat.ML

    Information Directed Sampling for Stochastic Bandits with Graph Feedback

    Authors: Fang Liu, Swapna Buccapatnam, Ness Shroff

    Abstract: We consider stochastic multi-armed bandit problems with graph feedback, where the decision maker is allowed to observe the neighboring actions of the chosen action. We allow the graph structure to vary with time and consider both deterministic and Erdős-Rényi random graph models. For such a graph feedback model, we first present a novel analysis of Thompson sampling that leads to tighter performan…

    Submitted 8 November, 2017; originally announced November 2017.

    Comments: Accepted by AAAI 2018

  20. arXiv:1704.07943  [pdf, other]

    cs.LG stat.ML

    Reward Maximization Under Uncertainty: Leveraging Side-Observations on Networks

    Authors: Swapna Buccapatnam, Fang Liu, Atilla Eryilmaz, Ness B. Shroff

    Abstract: We study the stochastic multi-armed bandit (MAB) problem in the presence of side-observations across actions that occur as a result of an underlying network structure. In our model, a bipartite graph captures the relationship between actions and a common set of unknowns such that choosing an action reveals observations for the unknowns that it is connected to. This models a common scenario in onli…

    Submitted 12 July, 2017; v1 submitted 25 April, 2017; originally announced April 2017.

    Comments: minor revision