Showing 1–6 of 6 results for author: Ishfaq, H

  1. arXiv:2406.12241  [pdf, other]

    cs.LG cs.AI

    More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling

    Authors: Haque Ishfaq, Yixin Tan, Yu Yang, Qingfeng Lan, Jianfeng Lu, A. Rupam Mahmood, Doina Precup, Pan Xu

    Abstract: Thompson sampling (TS) is one of the most popular exploration techniques in reinforcement learning (RL). However, most TS algorithms with theoretical guarantees are difficult to implement and not generalizable to Deep RL. While the emerging approximate sampling-based exploration schemes are promising, most existing algorithms are specific to linear Markov Decision Processes (MDP) with suboptimal r…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: First two authors contributed equally. Accepted to the Reinforcement Learning Conference (RLC) 2024

  2. arXiv:2403.11574  [pdf, ps, other]

    cs.LG

    Offline Multitask Representation Learning for Reinforcement Learning

    Authors: Haque Ishfaq, Thanh Nguyen-Tang, Songtao Feng, Raman Arora, Mengdi Wang, Ming Yin, Doina Precup

    Abstract: We study offline multitask representation learning in reinforcement learning (RL), where a learner is provided with an offline dataset from different tasks that share a common representation and is asked to learn the shared representation. We theoretically investigate offline multitask low-rank RL, and propose a new algorithm called MORL for offline multitask representation learning. Furthermore,…

    Submitted 18 March, 2024; originally announced March 2024.

  3. arXiv:2305.18246  [pdf, other]

    cs.LG

    Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo

    Authors: Haque Ishfaq, Qingfeng Lan, Pan Xu, A. Rupam Mahmood, Doina Precup, Anima Anandkumar, Kamyar Azizzadenesheli

    Abstract: We present a scalable and effective exploration strategy based on Thompson sampling for reinforcement learning (RL). One of the key shortcomings of existing Thompson sampling algorithms is the need to perform a Gaussian approximation of the posterior distribution, which is not a good surrogate in most practical settings. We instead directly sample the Q function from its posterior distribution, by…

    Submitted 17 March, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: Published in The Twelfth International Conference on Learning Representations (ICLR) 2024

  4. arXiv:2106.07841  [pdf, other]

    cs.LG stat.ML

    Randomized Exploration for Reinforcement Learning with General Value Function Approximation

    Authors: Haque Ishfaq, Qiwen Cui, Viet Nguyen, Alex Ayoub, Zhuoran Yang, Zhaoran Wang, Doina Precup, Lin F. Yang

    Abstract: We propose a model-free reinforcement learning algorithm inspired by the popular randomized least squares value iteration (RLSVI) algorithm as well as the optimism principle. Unlike existing upper-confidence-bound (UCB) based approaches, which are often computationally intractable, our algorithm drives exploration by simply perturbing the training data with judiciously chosen i.i.d. scalar noises.…

    Submitted 25 October, 2021; v1 submitted 14 June, 2021; originally announced June 2021.

    Comments: 32 pages, 5 figures, in Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021

  5. arXiv:1911.02085  [pdf, other]

    cs.AI cs.CL

    Path-Based Contextualization of Knowledge Graphs for Textual Entailment

    Authors: Kshitij Fadnis, Kartik Talamadupula, Pavan Kapanipathi, Haque Ishfaq, Salim Roukos, Achille Fokoue

    Abstract: In this paper, we introduce the problem of knowledge graph contextualization -- that is, given a specific NLP task, the problem of extracting meaningful and relevant sub-graphs from a given knowledge graph. The task in the case of this paper is the textual entailment problem, and the context is a relevant sub-graph for an instance of the textual entailment problem -- where given two sentences p an…

    Submitted 3 February, 2020; v1 submitted 5 November, 2019; originally announced November 2019.

  6. arXiv:1802.04403  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG

    TVAE: Triplet-Based Variational Autoencoder using Metric Learning

    Authors: Haque Ishfaq, Assaf Hoogi, Daniel Rubin

    Abstract: Deep metric learning has been demonstrated to be highly effective in learning semantic representation and encoding information that can be used to measure data similarity, by relying on the embedding learned from metric learning. At the same time, variational autoencoder (VAE) has widely been used to approximate inference and proved to have a good performance for directed probabilistic models. How…

    Submitted 8 February, 2023; v1 submitted 12 February, 2018; originally announced February 2018.

    Comments: Old technical note

    MSC Class: 68T30 (Primary); 68T01 (Secondary)