Showing 1–21 of 21 results for author: Jain, R

Searching in archive stat.
  1. arXiv:2405.15090 [pdf, other]

    cs.LG stat.ML

    Pure Exploration for Constrained Best Mixed Arm Identification with a Fixed Budget

    Authors: Dengwang Tang, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo

    Abstract: In this paper, we introduce the constrained best mixed arm identification (CBMAI) problem with a fixed budget. This is a pure exploration problem in a stochastic finite armed bandit model. Each arm is associated with a reward and multiple types of costs from unknown distributions. Unlike the unconstrained best arm identification problem, the optimal solution for the CBMAI problem may be a randomiz…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 7 pages, 5 figures, 1 table

  2. arXiv:2310.11531 [pdf, ps, other]

    cs.LG cs.AI eess.SY stat.ML

    Efficient Online Learning with Offline Datasets for Infinite Horizon MDPs: A Bayesian Approach

    Authors: Dengwang Tang, Rahul Jain, Botao Hao, Zheng Wen

    Abstract: In this paper, we study the problem of efficient online reinforcement learning in the infinite horizon setting when there is an offline dataset to start with. We assume that the offline dataset is generated by an expert but with unknown level of competence, i.e., it is not perfect and not necessarily using the optimal policy. We show that if the learning agent models the behavioral policy (paramet…

    Submitted 1 February, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: 22 pages

    MSC Class: 93E35

  3. arXiv:2310.10107 [pdf, other]

    cs.LG cs.AI eess.SY stat.ML

    Posterior Sampling-based Online Learning for Episodic POMDPs

    Authors: Dengwang Tang, Dongze Ye, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo

    Abstract: Learning in POMDPs is known to be significantly harder than MDPs. In this paper, we consider the online learning problem for episodic POMDPs with unknown transition and observation models. We propose a Posterior Sampling-based reinforcement learning algorithm for POMDPs (PS4POMDPs), which is much simpler and more implementable compared to state-of-the-art optimism-based online learning algorithms…

    Submitted 23 May, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: 32 pages, 4 figures

    MSC Class: 93E35

  4. arXiv:2302.03319 [pdf, ps, other]

    cs.LG math.ST stat.ML

    Leveraging Demonstrations to Improve Online Learning: Quality Matters

    Authors: Botao Hao, Rahul Jain, Tor Lattimore, Benjamin Van Roy, Zheng Wen

    Abstract: We investigate the extent to which offline demonstration data can improve online learning. It is natural to expect some improvement, but the question is how, and by how much? We show that the degree of improvement must depend on the quality of the demonstration data. To generate portable insights, we focus on Thompson sampling (TS) applied to a multi-armed bandit as a prototypical online learning…

    Submitted 17 May, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: Accepted at ICML 2023

  5. Invisible Walls: Exploration of Microclimate Effects on Building Energy Consumption in New York City

    Authors: Thomas Dougherty, Rishee Jain

    Abstract: The reduction of greenhouse gases from buildings forms the cornerstone of policy to mitigate the effects of climate change. However, the automation of urban scale building energy modeling systems required to meet global urban demand has proven challenging due to the bespoke characteristics of each city. One such point of uniqueness between cities is that of urban microclimate, which may play a maj…

    Submitted 5 August, 2022; originally announced August 2022.

    Comments: 16 pages, 18 figures

  6. arXiv:2110.08710 [pdf, other]

    cs.LG cs.LO cs.PL stat.ML

    NeuralArTS: Structuring Neural Architecture Search with Type Theory

    Authors: Robert Wu, Nayan Saxena, Rohan Jain

    Abstract: Neural Architecture Search (NAS) algorithms automate the task of finding optimal deep learning architectures given an initial search space of possible operations. Developing these search spaces is usually a manual affair with pre-optimized search spaces being more efficient, rather than searching from scratch. In this paper we present a new framework called Neural Architecture Type System (NeuralA…

    Submitted 5 November, 2021; v1 submitted 16 October, 2021; originally announced October 2021.

    Comments: (Student Abstract) In Proceedings of the 36th AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 2022

  7. arXiv:2106.14406 [pdf, other]

    cs.LG cs.CR cs.NE stat.ML

    Poisoning the Search Space in Neural Architecture Search

    Authors: Robert Wu, Nayan Saxena, Rohan Jain

    Abstract: Deep learning has proven to be a highly effective problem-solving tool for object detection and image segmentation across various domains such as healthcare and autonomous driving. At the heart of this performance lies neural architecture design which relies heavily on domain knowledge and prior experience on the researchers' behalf. More recently, this process of finding the most optimal architec…

    Submitted 28 June, 2021; originally announced June 2021.

    Comments: All authors contributed equally. Appears in AdvML Workshop @ ICML2021: A Blessing in Disguise: The Prospects and Perils of Adversarial Machine Learning

  8. arXiv:2106.08377 [pdf, other]

    cs.LG stat.ML

    Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path

    Authors: Liyu Chen, Mehdi Jafarnia-Jahromi, Rahul Jain, Haipeng Luo

    Abstract: We introduce a generic template for developing regret minimization algorithms in the Stochastic Shortest Path (SSP) model, which achieves minimax optimal regret as long as certain properties are ensured. The key of our analysis is a new technique called implicit finite-horizon approximation, which approximates the SSP model by a finite-horizon counterpart only in the analysis without explicit impl…

    Submitted 9 November, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

  9. arXiv:2009.11348 [pdf, ps, other]

    cs.LG cs.AI eess.SY stat.ML

    A Sample-Efficient Algorithm for Episodic Finite-Horizon MDP with Constraints

    Authors: Krishna C. Kalagarla, Rahul Jain, Pierluigi Nuzzo

    Abstract: Constrained Markov Decision Processes (CMDPs) formalize sequential decision-making problems whose objective is to minimize a cost function while satisfying constraints on various cost functions. In this paper, we consider the setting of episodic fixed-horizon CMDPs. We propose an online algorithm which leverages the linear programming formulation of finite-horizon CMDP for repeated optimistic plan…

    Submitted 23 September, 2020; originally announced September 2020.

  10. arXiv:2007.11849 [pdf, other]

    cs.LG stat.ML

    Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation

    Authors: Chen-Yu Wei, Mehdi Jafarnia-Jahromi, Haipeng Luo, Rahul Jain

    Abstract: We develop several new algorithms for learning Markov Decision Processes in an infinite-horizon average-reward setting with linear function approximation. Using the optimism principle and assuming that the MDP has a linear structure, we first propose a computationally inefficient algorithm with optimal $\widetilde{O}(\sqrt{T})$ regret and another computationally efficient variant with…

    Submitted 26 April, 2021; v1 submitted 23 July, 2020; originally announced July 2020.

  11. arXiv:2006.04354   

    cs.LG stat.ML

    A Model-free Learning Algorithm for Infinite-horizon Average-reward MDPs with Near-optimal Regret

    Authors: Mehdi Jafarnia-Jahromi, Chen-Yu Wei, Rahul Jain, Haipeng Luo

    Abstract: Recently, model-free reinforcement learning has attracted research attention due to its simplicity, memory and computation efficiency, and the flexibility to combine with function approximation. In this paper, we propose Exploration Enhanced Q-learning (EE-QL), a model-free algorithm for infinite-horizon average-reward Markov Decision Processes (MDPs) that achieves regret bound of $O(\sqrt{T})$ fo…

    Submitted 8 December, 2020; v1 submitted 8 June, 2020; originally announced June 2020.

    Comments: There is a bug in the proof of Lemma 2 that is not easily fixable

  12. arXiv:2006.04331 [pdf, other]

    cs.LG cs.AI stat.ML

    Randomized Policy Learning for Continuous State and Action MDPs

    Authors: Hiteshi Sharma, Rahul Jain

    Abstract: Deep reinforcement learning methods have achieved state-of-the-art results in a variety of challenging, high-dimensional domains ranging from video games to locomotion. The key to success has been the use of deep neural networks used to approximate the policy and value function. Yet, substantial tuning of weights is required for good results. We instead use randomized function approximation. Such…

    Submitted 15 November, 2020; v1 submitted 7 June, 2020; originally announced June 2020.

  13. arXiv:1910.07072 [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes

    Authors: Chen-Yu Wei, Mehdi Jafarnia-Jahromi, Haipeng Luo, Hiteshi Sharma, Rahul Jain

    Abstract: Model-free reinforcement learning is known to be memory and computation efficient and more amenable to large scale problems. In this paper, two model-free algorithms are introduced for learning infinite-horizon average-reward Markov Decision Processes (MDPs). The first algorithm reduces the problem to the discounted-reward version and achieves $\mathcal{O}(T^{2/3})$ regret after $T$ steps, under…

    Submitted 25 February, 2020; v1 submitted 15 October, 2019; originally announced October 2019.

  14. arXiv:1909.10447 [pdf, other]

    cs.LG cs.CL cs.NE stat.ML

    On Model Stability as a Function of Random Seed

    Authors: Pranava Madhyastha, Rishabh Jain

    Abstract: In this paper, we focus on quantifying model stability as a function of random seed by investigating the effects of the induced randomness on model performance and the robustness of the model in general. We specifically perform a controlled study on the effect of random seeds on the behaviour of attention, gradient-based and surrogate model based (LIME) interpretations. Our analysis suggests that…

    Submitted 23 September, 2019; originally announced September 2019.

    Comments: v1; Accepted for publication at CoNLL 2019

  15. arXiv:1907.01367 [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Lipper: Synthesizing Thy Speech using Multi-View Lipreading

    Authors: Yaman Kumar, Rohit Jain, Khwaja Mohd. Salik, Rajiv Ratn Shah, Yifang Yin, Roger Zimmermann

    Abstract: Lipreading has a lot of potential applications such as in the domain of surveillance and video conferencing. Despite this, most of the work in building lipreading systems has been limited to classifying silent videos into classes representing text phrases. However, there are multiple problems associated with making lipreading a text-based classification task like its dependence on a particular lan…

    Submitted 28 June, 2019; originally announced July 2019.

    Comments: Accepted at AAAI 2019

  16. arXiv:1906.06968 [pdf]

    cs.LG cs.CR stat.ML

    Scrubbing Sensitive PHI Data from Medical Records made Easy by SpaCy -- A Scalable Model Implementation Comparisons

    Authors: Rashmi Jain, Dinah Samuel Anand, Vijayalakshmi Janakiraman

    Abstract: De-identification of clinical records is an extremely important process which enables the use of the wealth of information present in them. There are a lot of techniques available for this but none of the method implementation has evaluated the scalability, which is an important benchmark. We evaluated numerous deep learning techniques such as BiLSTM-CNN, IDCNN, CRF, BiLSTM-CRF, SpaCy, etc. on bot…

    Submitted 17 June, 2019; originally announced June 2019.

    Comments: 9 Pages, 7 Figures, 2 Tables

    ACM Class: I.2.7; I.5.4

  17. arXiv:1902.05069 [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    Improving performance and inference on audio classification tasks using capsule networks

    Authors: Royal Jain

    Abstract: Classification of audio samples is an important part of many auditory systems. Deep learning models based on the Convolutional and the Recurrent layers are state-of-the-art in many such tasks. In this paper, we approach audio classification tasks using capsule networks trained by recently proposed dynamic routing-by-agreement mechanism. We propose an architecture for capsule networks fit for audio…

    Submitted 13 February, 2019; originally announced February 2019.

  18. arXiv:1812.05443 [pdf]

    cs.CR cs.LG stat.ML

    Machine Learning for Anomaly Detection and Categorization in Multi-cloud Environments

    Authors: Tara Salman, Deval Bhamare, Aiman Erbad, Raj Jain, Mohammed Samaka

    Abstract: Recently, advances in machine learning techniques have attracted the attention of the research community to build intrusion detection systems (IDS) that can detect anomalies in the network traffic. Most of the research works, however, do not differentiate among different types of attacks. This is, in fact, necessary for appropriate countermeasures and defense against attacks. In this paper, we inv…

    Submitted 23 October, 2018; originally announced December 2018.

    Comments: CSCLoud17

    Journal ref: CSCLOUD 2017

  19. Feasibility of Supervised Machine Learning for Cloud Security

    Authors: Deval Bhamare, Tara Salman, Mohammed Samaka, Aiman Erbad, Raj Jain

    Abstract: Cloud computing is gaining significant attention, however, security is the biggest hurdle in its wide acceptance. Users of cloud services are under constant fear of data loss, security threats and availability issues. Recently, learning-based methods for security applications are gaining popularity in the literature with the advents in machine learning techniques. However, the major challenge in t…

    Submitted 23 October, 2018; originally announced October 2018.

    Journal ref: 2016 International Conference on Information Science and Security (ICISS)

  20. arXiv:1702.04415 [pdf, other]

    cs.LG stat.ML

    Small Boxes Big Data: A Deep Learning Approach to Optimize Variable Sized Bin Packing

    Authors: Feng Mao, Edgar Blanco, Mingang Fu, Rohit Jain, Anurag Gupta, Sebastien Mancel, Rong Yuan, Stephen Guo, Sai Kumar, Yayang Tian

    Abstract: Bin Packing problems have been widely studied because of their broad applications in different domains. Known as a set of NP-hard problems, they have different variations and many heuristics have been proposed for obtaining approximate solutions. Specifically, for the 1D variable sized bin packing problem, the two key sets of optimization heuristics are the bin assignment and the bin allocation.…

    Submitted 14 February, 2017; originally announced February 2017.

    Comments: The Third IEEE International Conference on Big Data Computing Service and Applications, 2017

    ACM Class: I.1.2; I.2.8

  21. arXiv:1505.00553 [pdf, other]

    stat.ML cs.LG

    On Regret-Optimal Learning in Decentralized Multi-player Multi-armed Bandits

    Authors: Naumaan Nayyar, Dileep Kalathil, Rahul Jain

    Abstract: We consider the problem of learning in single-player and multiplayer multiarmed bandit models. Bandit problems are classes of online learning problems that capture exploration versus exploitation tradeoffs. In a multiarmed bandit model, players can pick among many arms, and each play of an arm generates an i.i.d. reward from an unknown distribution. The objective is to design a policy that maximiz…

    Submitted 1 December, 2016; v1 submitted 4 May, 2015; originally announced May 2015.