Showing 1–19 of 19 results for author: Mehta, N A

  1. arXiv:2404.05155  [pdf, other]

    cs.LG cs.GT stat.ML

    On the price of exact truthfulness in incentive-compatible online learning with bandit feedback: A regret lower bound for WSU-UX

    Authors: Ali Mortazavi, Junhao Lin, Nishant A. Mehta

    Abstract: In one view of the classical game of prediction with expert advice with binary outcomes, in each round, each expert maintains an adversarially chosen belief and honestly reports this belief. We consider a recently introduced, strategic variant of this problem with selfish (reputation-seeking) experts, where each expert strategically reports in order to maximize their expected future reputation bas…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: Accepted to AISTATS 2024

  2. arXiv:2403.01315  [pdf, ps, other]

    cs.LG stat.ML

    Near-optimal Per-Action Regret Bounds for Sleeping Bandits

    Authors: Quan Nguyen, Nishant A. Mehta

    Abstract: We derive near-optimal per-action regret bounds for sleeping bandits, in which both the sets of available arms and their losses in every round are chosen by an adversary. In a setting with $K$ total arms and at most $A$ available arms in each round over $T$ rounds, the best known upper bound is $O(K\sqrt{TA\ln{K}})$, obtained indirectly via minimizing internal sleeping regrets. Compared to the min…

    Submitted 29 May, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

    Comments: V2: corrected Theorem 8 (FTARL's high probability bound) from log(1/delta) to log(K/delta)

  3. arXiv:2305.04093  [pdf, other]

    cs.LG

    An improved regret analysis for UCB-N and TS-N

    Authors: Nishant A. Mehta

    Abstract: In the setting of stochastic online learning with undirected feedback graphs, Lykouris et al. (2020) previously analyzed the pseudo-regret of the upper confidence bound-based algorithm UCB-N and the Thompson Sampling-based algorithm TS-N. In this note, we show how to improve their pseudo-regret analysis. Our improvement involves refining a key lemma of the previous analysis, allowing a $\log(T)$ f…

    Submitted 6 May, 2023; originally announced May 2023.

    Comments: 5 pages

  4. arXiv:2301.04268  [pdf, other]

    cs.LG cs.AI stat.ML

    Adversarial Online Multi-Task Reinforcement Learning

    Authors: Quan Nguyen, Nishant A. Mehta

    Abstract: We consider the adversarial online multi-task reinforcement learning setting, where in each of $K$ episodes the learner is given an unknown task taken from a finite set of $M$ unknown finite-horizon MDP models. The learner's objective is to minimize its regret with respect to the optimal policy for each task. We assume the MDPs in $\mathcal{M}$ are well-separated under a notion of $λ$-separability…

    Submitted 10 January, 2023; originally announced January 2023.

    Comments: To appear at the 34th International Conference on Algorithmic Learning Theory (ALT 2023)

  5. arXiv:2106.12688  [pdf, other]

    cs.LG

    Best-Case Lower Bounds in Online Learning

    Authors: Cristóbal Guzmán, Nishant A. Mehta, Ali Mortazavi

    Abstract: Much of the work in online learning focuses on the study of sublinear upper bounds on the regret. In this work, we initiate the study of best-case lower bounds in online convex optimization, wherein we bound the largest improvement an algorithm can obtain relative to the single best action in hindsight. This problem is motivated by the goal of better understanding the adaptivity of a learning algo…

    Submitted 23 June, 2021; originally announced June 2021.

    Comments: 28 pages

  6. arXiv:2102.07929  [pdf, other]

    cs.LG

    Near-Optimal Algorithms for Differentially Private Online Learning in a Stochastic Environment

    Authors: Bingshan Hu, Zhiming Huang, Nishant A. Mehta, Nidhi Hegde

    Abstract: In this paper, we study differentially private online learning problems in a stochastic environment under both bandit and full information feedback. For differentially private stochastic bandits, we propose both UCB and Thompson Sampling-based algorithms that are anytime and achieve the optimal $O \left(\sum_{j: Δ_j>0} \frac{\ln(T)}{\min \left\{Δ_j, ε\right\}} \right)$ instance-dependent regret bo…

    Submitted 30 May, 2024; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: 40 pages. New in v3: (i) Removed Hybrid-UCB (although its analysis is correct to our knowledge); (ii) Added Lazy-DP-TS from UAI 2022 paper of Hu and Hegde (2022)

  7. arXiv:2003.03456  [pdf, other]

    cs.LG cs.AI stat.ML

    A Farewell to Arms: Sequential Reward Maximization on a Budget with a Giving Up Option

    Authors: P Sharoff, Nishant A. Mehta, Ravi Ganti

    Abstract: We consider a sequential decision-making problem where an agent can take one action at a time and each action has a stochastic temporal extent, i.e., a new action cannot be taken until the previous one is finished. Upon completion, the chosen action yields a stochastic reward. The agent seeks to maximize its cumulative reward over a finite time budget, with the option of "giving up" on a current a…

    Submitted 6 March, 2020; originally announced March 2020.

    Comments: 16 pages, AISTATS 2020

  8. arXiv:1910.13521  [pdf, other]

    cs.LG stat.ML

    Dying Experts: Efficient Algorithms with Optimal Regret Bounds

    Authors: Hamid Shayestehmanesh, Sajjad Azami, Nishant A. Mehta

    Abstract: We study a variant of decision-theoretic online learning in which the set of experts that are available to Learner can shrink over time. This is a restricted version of the well-studied sleeping experts problem, itself a generalization of the fundamental game of prediction with expert advice. Similar to many works in this direction, our benchmark is the ranking regret. Various results suggest that…

    Submitted 29 October, 2019; originally announced October 2019.

    Comments: 18 Pages, NeurIPS 2019

  9. arXiv:1802.09680  [pdf, other]

    cs.LG

    Multi-Observation Regression

    Authors: Rafael Frongillo, Nishant A. Mehta, Tom Morgan, Bo Waggoner

    Abstract: Recent work introduced loss functions which measure the error of a prediction based on multiple simultaneous observations or outcomes. In this paper, we explore the theoretical and practical questions that arise when using such multi-observation losses for regression on data sets of $(x,y)$ pairs. When a loss depends on only one observation, the average empirical loss decomposes by applying the lo…

    Submitted 26 February, 2018; originally announced February 2018.

    Comments: 28 pages

  10. arXiv:1710.07732  [pdf, other]

    cs.LG stat.ML

    A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity

    Authors: Peter D. Grünwald, Nishant A. Mehta

    Abstract: We present a novel notion of complexity that interpolates between and generalizes some classic existing complexity notions in learning theory: for estimators like empirical risk minimization (ERM) with arbitrary bounded losses, it is upper bounded in terms of data-independent Rademacher complexity; for generalized Bayesian estimators, it is upper bounded by the data-dependent information complexit…

    Submitted 20 October, 2017; originally announced October 2017.

    Comments: 38 pages

  11. arXiv:1609.03319  [pdf, other]

    cs.LG stat.ML

    CompAdaGrad: A Compressed, Complementary, Computationally-Efficient Adaptive Gradient Method

    Authors: Nishant A. Mehta, Alistair Rendell, Anish Varghese, Christfried Webers

    Abstract: The adaptive gradient online learning method known as AdaGrad has seen widespread use in the machine learning community in stochastic and adversarial online learning problems and more recently in deep learning methods. The method's full-matrix incarnation offers much better theoretical guarantees and potentially better empirical performance than its diagonal version; however, this version is compu…

    Submitted 4 October, 2016; v1 submitted 12 September, 2016; originally announced September 2016.

    Comments: only updated acknowledgements

  12. arXiv:1605.01288  [pdf, other]

    cs.LG

    Fast rates with high probability in exp-concave statistical learning

    Authors: Nishant A. Mehta

    Abstract: We present an algorithm for the statistical learning setting with a bounded exp-concave loss in $d$ dimensions that obtains excess risk $O(d \log(1/δ)/n)$ with probability at least $1 - δ$. The core technique is to boost the confidence of recent in-expectation $O(d/n)$ excess risk bounds for empirical risk minimization (ERM), without sacrificing the rate, by leveraging a Bernstein condition which…

    Submitted 14 October, 2016; v1 submitted 4 May, 2016; originally announced May 2016.

    Comments: added results on model selection aggregation (Section 7)

  13. arXiv:1605.00252  [pdf, other]

    cs.LG stat.ML

    Fast Rates for General Unbounded Loss Functions: from ERM to Generalized Bayes

    Authors: Peter D. Grünwald, Nishant A. Mehta

    Abstract: We present new excess risk bounds for general unbounded loss functions including log loss and squared loss, where the distribution of the losses may be heavy-tailed. The bounds hold for general estimators, but they are optimized when applied to $η$-generalized Bayesian, MDL, and empirical risk minimization estimators. In the case of log loss, the bounds imply convergence rates for generalized Baye…

    Submitted 5 November, 2019; v1 submitted 1 May, 2016; originally announced May 2016.

    Comments: accepted to JMLR pending minor final modifications

  14. arXiv:1507.02592  [pdf, other]

    cs.LG stat.ML

    Fast rates in statistical and online learning

    Authors: Tim van Erven, Peter D. Grünwald, Nishant A. Mehta, Mark D. Reid, Robert C. Williamson

    Abstract: The speed with which a learning algorithm converges as it is presented with more data is a central problem in machine learning --- a fast rate of convergence means less data is needed for the same level of performance. The pursuit of fast rates in online and statistical learning has led to the discovery of many conditions in learning theory under which fast learning is possible. We show that most…

    Submitted 1 September, 2015; v1 submitted 9 July, 2015; originally announced July 2015.

    Comments: 69 pages, 3 figures

    Journal ref: Journal of Machine Learning Research 16(54):1793-1861, 2015

  15. arXiv:1406.3781  [pdf, other]

    cs.LG stat.ML

    From Stochastic Mixability to Fast Rates

    Authors: Nishant A. Mehta, Robert C. Williamson

    Abstract: Empirical risk minimization (ERM) is a fundamental learning rule for statistical learning problems where the data is generated according to some unknown distribution $\mathsf{P}$ and returns a hypothesis $f$ chosen from a fixed class $\mathcal{F}$ with small loss $\ell$. In the parametric setting, depending upon $(\ell, \mathcal{F},\mathsf{P})$ ERM can have slow $(1/\sqrt{n})$ or fast $(1/n)$ rate…

    Submitted 22 November, 2014; v1 submitted 14 June, 2014; originally announced June 2014.

    Comments: 21 pages, accepted to NIPS 2014

  16. arXiv:1210.6293  [pdf, ps, other]

    cs.MS cs.CV cs.LG

    MLPACK: A Scalable C++ Machine Learning Library

    Authors: Ryan R. Curtin, James R. Cline, N. P. Slagle, William B. March, Parikshit Ram, Nishant A. Mehta, Alexander G. Gray

    Abstract: MLPACK is a state-of-the-art, scalable, multi-platform C++ machine learning library released in late 2011 offering both a simple, consistent API accessible to novice users and high performance and flexibility to expert users by leveraging modern features of C++. MLPACK provides cutting-edge algorithms whose benchmarks exhibit far better performance than other leading machine learning libraries. ML…

    Submitted 23 October, 2012; originally announced October 2012.

    Comments: Submitted to JMLR MLOSS (http://jmlr.csail.mit.edu/mloss/)

    Journal ref: Journal of Machine Learning Research 14 (2013) 801-805

  17. arXiv:1209.2784  [pdf, other]

    cs.LG stat.ML

    Minimax Multi-Task Learning and a Generalized Loss-Compositional Paradigm for MTL

    Authors: Nishant A. Mehta, Dongryeol Lee, Alexander G. Gray

    Abstract: Since its inception, the modus operandi of multi-task learning (MTL) has been to minimize the task-wise mean of the empirical risks. We introduce a generalized loss-compositional paradigm for MTL that includes a spectrum of formulations as a subfamily. One endpoint of this spectrum is minimax MTL: a new MTL formulation that minimizes the maximum of the tasks' empirical risks. Via a certain relaxat…

    Submitted 13 September, 2012; originally announced September 2012.

    Comments: appearing at NIPS 2012

  18. arXiv:1202.4050  [pdf, other]

    cs.LG stat.ML

    On the Sample Complexity of Predictive Sparse Coding

    Authors: Nishant A. Mehta, Alexander G. Gray

    Abstract: The goal of predictive sparse coding is to learn a representation of examples as sparse linear combinations of elements from a dictionary, such that a learned hypothesis linear in the new representation performs well on a predictive task. Predictive sparse coding algorithms recently have demonstrated impressive performance on a variety of supervised tasks, but their generalization properties have…

    Submitted 7 October, 2012; v1 submitted 17 February, 2012; originally announced February 2012.

    Comments: Sparse Coding Stability Theorem from version 1 has been relaxed considerably using a new notion of coding margin. Old Sparse Coding Stability Theorem still in new version, now as Theorem 2. Presentation of all proofs simplified/improved considerably. Paper reorganized. Empirical analysis showing new coding margin is non-trivial on real datasets

  19. arXiv:1005.0188  [pdf, other]

    cs.LG stat.ML

    Generative and Latent Mean Map Kernels

    Authors: Nishant A. Mehta, Alexander G. Gray

    Abstract: We introduce two kernels that extend the mean map, which embeds probability measures in Hilbert spaces. The generative mean map kernel (GMMK) is a smooth similarity measure between probabilistic models. The latent mean map kernel (LMMK) generalizes the non-iid formulation of Hilbert space embeddings of empirical distributions in order to incorporate latent variable models. When comparing certain c…

    Submitted 3 May, 2010; originally announced May 2010.

    Comments: 16 pages, 1 figure, 1 table